ABSTRACT

Comparing the clustering of differently biased tracers of the dark matter distribution offers
the opportunity to reduce the sample or cosmic variance error in the measurement
of certain cosmological parameters. We develop a formalism that includes bias non-linearities
and stochasticity. Our formalism is general enough that it can be used to
optimise survey design and tracer selection and to optimally split (or combine) tracers
to minimise the error on the cosmologically interesting quantities. Our approach
generalises the one presented by McDonald & Seljak (2009) of circumventing sample
variance in the measurement of $f \equiv d\ln D/d\ln a$. We analyse how the bias, the noise,
the non-linearity and stochasticity affect the measurements of $Df$ and explore in which
signal-to-noise regime it is significantly advantageous to split a galaxy sample in two
differently-biased tracers. We use N-body simulations to find realistic values for the
parameters describing the bias properties of dark matter haloes of different masses
and their number density. We find that, even if dark matter haloes could be used as
tracers and selected in an idealised way, for realistic haloes, the sample variance limit
can be reduced only by up to a factor $\sigma_{2tr}/\sigma_{1tr} \simeq 0.6$. This would still correspond
to the gain from a three times larger survey volume if the two tracers were not to be
split. Before any practical application one should bear in mind that these findings apply
to dark matter haloes as tracers, while realistic surveys would select galaxies: the
galaxy-host halo relation is likely to introduce extra stochasticity, which may reduce
the gain further.

Key words: cosmology: large-scale structure of Universe — cosmology: theory — 
cosmology: cosmological parameters 



1 INTRODUCTION 



One of the active topics of current research is the formation and growth of large-scale structure in the universe. Knowledge 
of the physical origin of the growth of structure will allow us to know about the origin of dark matter and also provide a 
useful way to discriminate between different theories for the origin and evolution of dark energy. In particular, comparing 
and combining measurements of the Universe expansion history (as given by e.g., Baryon Acoustic Oscillations, cosmic 
chronometers, Supernovae) with measurements of the linear growth of structure, can provide a tool to test whether dark 
energy is an extra component with negative pressure or a manifestation of the breakdown of general relativity on large scales. 
To this end, the usual goal is to measure the parameter $f(z)$, defined as $f(z) \equiv d\ln D(z)/d\ln a(z)$, where $D(z)$ is the linear
growth factor and $a(z)$ the scale factor.



The two main approaches to measure the growth of structure are weak gravitational lensing (e.g., Hoekstra et al. 2002;
Bacon et al. 2005) and galaxy clustering, which is the technique we consider here. Galaxy clustering is a relatively simple,
high signal-to-noise measurement: the angular position of galaxies can be measured using photometry, and the radial position
using spectroscopy. With this information the three-dimensional power spectrum of galaxies can be computed as a function
of redshift. However, at the two-point level, the galaxy field can only trace the dark matter field up to a bias factor $b$, which
may depend on scale and redshift. Thus, only once the bias is known can the galaxy power spectrum $P_{gg}$ be related to that
of the dark matter, $P_{mm}$, and thus yield the growth of structure. The main drawback associated with this technique is that the
value of this bias cannot readily be predicted from theoretical models of galaxy formation.

There are several observational techniques to measure the bias parameter (Fry 1994; Feldman et al. 2001; Verde et al.
2002; Seljak et al. 2005; Hawkins et al. 2003; Guzzo et al. 2008). In this work we use the approach that takes advantage of
the redshift-space distortions. Peculiar velocities of dark matter tracers are set by the gravitational field: using the measured
redshift as a distance indicator distorts clustering, enhancing it along the line of sight, and the redshift-space distortion
parameter is $\beta \equiv f/b$ (Davis & Peebles 1983; Kaiser 1987; Hamilton 1998). Using measurements in different directions
(different Fourier modes), one can compute $\beta$. In combination with the galaxy power spectrum measurement, this approach
yields the divergence power spectrum $P_{\theta\theta} = f^2 P_{mm}$, which can be directly compared with theory predictions and encloses
the desired dependence on the growth of structure.
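
As a rough illustration of how the angular dependence isolates $\beta$, the sketch below assumes the noise-free linear (Kaiser) form $P^s_{gg}(k,\mu) = b^2 P_{mm}(k)\,(1+\beta\mu^2)^2$ at a single wavenumber, so that $\sqrt{P^s_{gg}}$ is linear in $\mu^2$. The function name `fit_beta` and the numerical values of $b$ and $P_{mm}$ are hypothetical choices made only for this example; real data would also require a treatment of noise.

```python
import numpy as np

def fit_beta(mu, ps):
    """Fit sqrt(P^s) = a + c * mu^2; return beta = c / a and P_theta_theta = c^2."""
    slope, intercept = np.polyfit(mu**2, np.sqrt(ps), 1)
    return slope / intercept, slope**2

# Hypothetical inputs at one wavenumber (not taken from the paper).
f_true, b_true, p_mm = 0.483, 2.0, 1.0e4
mu = np.linspace(0.0, 1.0, 21)
ps = b_true**2 * p_mm * (1.0 + (f_true / b_true) * mu**2) ** 2

beta_fit, p_theta_fit = fit_beta(mu, ps)
print(beta_fit, f_true / b_true)        # fitted and input beta
print(p_theta_fit, f_true**2 * p_mm)    # fitted and input f^2 P_mm
```

The slope of the fit equals $f\sqrt{P_{mm}}$ and is independent of the bias, which is why its square gives $P_{\theta\theta}$ directly.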

There are two sources of error in the measurement of the galaxy power spectrum: the shot noise and the sample variance
(or cosmic variance). On the one hand, the shot noise is due to the fact that we use a discrete set of objects to characterise
the matter field. If this noise is Poisson, it is scale independent and equal to the reciprocal of the number density of objects
(Peebles 1980). On the other hand, the sample variance effect is due to the fact that the matter field has its origin in a random
realisation of the underlying cosmology. In a finite survey volume there are only a finite number of modes present; on large
scales in particular, we only have a few modes to perform the averaging. Thus, the total error on the power spectrum $P$ at a given
scale $k$ is $\sigma_P/P = \sqrt{2/N}\,(1 + \sigma_n/P)$. Here, $N$ is the number of modes measured (at the scale given by $k$) and $\sigma_n$ is the
shot noise contribution. We see that just reducing the shot noise (increasing the number density of objects) does not help, as
there is a natural limitation on our capacity to measure $P$ (and consequently $f(z)$), due to sample variance.
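
The saturation described above can be checked numerically. The following is a minimal sketch, assuming the standard Gaussian mode-counting form $\sigma_P/P = \sqrt{2/N}\,(1+\sigma_n/P)$ with Poisson noise $\sigma_n = 1/\bar n$; the number of modes, the power and the number densities are hypothetical values chosen for illustration.

```python
import numpy as np

def fractional_power_error(n_modes, power, nbar):
    """sigma_P / P = sqrt(2 / N) * (1 + sigma_n / P), with Poisson sigma_n = 1 / nbar."""
    shot_noise = 1.0 / nbar
    return np.sqrt(2.0 / n_modes) * (1.0 + shot_noise / power)

n_modes = 1000      # hypothetical number of modes in the k-bin
power = 1.0e4       # hypothetical power in (Mpc/h)^3
for nbar in (1e-5, 1e-4, 1e-3, 1e-2):   # number densities in (h/Mpc)^3
    err = fractional_power_error(n_modes, power, nbar)
    print(f"nbar = {nbar:.0e}  sigma_P/P = {err:.4f}")

# Sample-variance floor, approached as nbar grows without bound.
floor = np.sqrt(2.0 / n_modes)
print(f"floor = {floor:.4f}")
```

Increasing the number density moves the error towards `floor` but never below it, which is the limit the multi-tracer technique is designed to circumvent.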

In order to reduce this limitation, a multi-tracer technique has been advocated recently (Seljak 2009). It is based on the
usage of two differently biased tracers of the dark matter field. With this method the sample variance limit can be reduced.
The effectiveness of this method depends on a number of factors: the ratio of the two biases, the signal-to-noise regime
and the non-linearity of the biases. With the exception of gravitational lensing, one cannot see the dark matter directly,
nor the dark matter haloes, so in most practical applications tracers such as galaxies, quasars, clusters, or
21 cm emission need to be used.

The goal of this paper is to study the possibility of measuring the parameter $x(z) \equiv f^2(z)D^2(z)$ using the single- and the
multi-tracer formalism and see whether the reduction of the sample variance is significant. Here, we present a new formalism
to estimate the error on $x$ with the multi-tracer approach, taking into account that the bias may be scale dependent,
non-linear and stochastic. This formalism may be useful for galaxy surveys, because it has been observed that galaxy
biasing is significantly non-linear and stochastic. N-body simulations and theoretical models allow us to estimate the
bias characteristics of dark matter tracers, and therefore the precision that can be reached with this model.

In §2 we begin by introducing the formalism of our method and analyse how the different parameters affect the reduction 
of the sample variance effect. In §3 we use both analytical approximations and N-body simulations to obtain physically 
motivated parameters for our model and compute realistic expected errors using the single- and multi-tracer formalism. In §4 
we conclude with a summary and a discussion of the results. 



2 METHOD 

We start with the basic assumptions that the tracer (galaxies or haloes) number density is given by a Poisson sampling of
an underlying continuous field $n_g(\mathbf{x})$, with overdensity defined by $\delta_g(\mathbf{x}) \equiv n_g(\mathbf{x})/\bar n_g - 1$, and that the galaxy overdensity
field is related to the mass overdensity field $\delta$ by a conditional probability distribution $P(\delta_g|\delta)$, including a stochastic element
which we will describe later. In addition, in realistic surveys using the redshift as distance indicator, peculiar velocities distort
clustering in a manner dependent on the angle with respect to the line of sight. In particular, clustering is enhanced (at least
on large scales, in the linear regime) along the line of sight. It is the angular dependence of the effect that yields the signal
to extract a measurement of the growth of structure. In this paper, our main goal is to estimate the error of the growth
rate of perturbations, $f(z)$, generalising the method proposed by McDonald & Seljak (2009) to measure $f(z)$, reducing the
sample variance limit and using redshift-space distortions. The idea is to split the sample of objects into two sub-samples
with different biasing properties. In their work they used a linear bias model and all stochasticity was due to shot noise. Here
we generalise their work to the non-linear bias case and take into account the possibility of having off-diagonal noise terms,
which can be introduced, for example, if there are objects in common in the two samples. In this regime, we compare the
multi-tracer with the single-tracer approach and we analyse how the noise and non-linearities affect the extent to which the
sample variance limit can be improved. We also use results from simulations to set plausible values for these parameters to
see how great the gains may be in practice.

^ This model may be inadequate in details when dealing with real haloes (Smith et al. 2007; Seljak et al. 2009). We will return to this
point in §3.



2.1 Modelling of bias 

The relation between clustering properties of the dark matter and those of the tracer (haloes or galaxies) goes under the name
of "bias". The simplest non-trivial bias model is linear bias, $\delta_g(\mathbf{x}) = b_1\delta(\mathbf{x})$, with $b_1$ constant and independent of position and
scale. This corresponds to a deterministic linear biasing, which has little physical motivation and is problematic if $b_1 > 1$ and
the field is not very linear, since it allows an unphysical $\delta_g < -1$ in voids. A more complex relation is almost certainly needed
to properly describe galaxy clustering. Non-linear biasing, with a bias which is no longer a constant but a function of $\delta(\mathbf{x})$, is
a common way to improve the model.

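Here we adopt the formalism proposed by Dekel & Lahav (1999). We assume that both $\delta(\mathbf{x})$ and $\delta_g(\mathbf{x})$ are random fields
with one-point probability distribution functions, $P(\delta(\mathbf{x}))$ and $P(\delta_g(\mathbf{x}))$, with zero mean, $\langle\delta(\mathbf{x})\rangle = \langle\delta_g(\mathbf{x})\rangle = 0$, and variances
$\langle\delta^2(\mathbf{x})\rangle$ and $\langle\delta_g^2(\mathbf{x})\rangle$ respectively.

We first define the mean biasing function, $b[\delta(\mathbf{x})]$, as the conditional mean between the galaxy and the matter field,

$$b[\delta(\mathbf{x})]\,\delta(\mathbf{x}) \equiv \langle\delta_g(\mathbf{x})|\delta(\mathbf{x})\rangle = \int d\delta_g(\mathbf{x})\, P(\delta_g(\mathbf{x})|\delta(\mathbf{x}))\,\delta_g(\mathbf{x}). \qquad (1)$$

This is the natural generalisation of the deterministic linear biasing relation, where the function $b[\delta(\mathbf{x})]$ characterises the
non-linear bias behaviour. Note that $P(\delta_g|\delta)$ can have a width (i.e. a scatter) around the mean relation, $b(\delta)$, which is however
not captured by the function $b(\delta)$. We characterise the function $b(\delta)$ by the first- and second-order moments, which are given
by $\hat b(r)$ and $\tilde b(r)$ at zero lag (i.e. $r = 0$),

$$\hat b(r) \equiv \frac{\langle\delta(\mathbf{x}+\mathbf{r})\,\delta(\mathbf{x})\,b[\delta(\mathbf{x})]\rangle}{\langle\delta(\mathbf{x})\,\delta(\mathbf{x}+\mathbf{r})\rangle}, \qquad (2)$$

$$\tilde b^2(r) \equiv \frac{\langle\delta(\mathbf{x}+\mathbf{r})\,\delta(\mathbf{x})\,b[\delta(\mathbf{x})]\,b[\delta(\mathbf{x}+\mathbf{r})]\rangle}{\langle\delta(\mathbf{x})\,\delta(\mathbf{x}+\mathbf{r})\rangle}, \qquad (3)$$

where $\langle\,\rangle$ represents the averaging over the volume of the survey or over different realisations. These two parameters take
into account the non-linearity of the system as long as one is concerned with the two-point correlation function (or the power
spectrum) and not higher-order correlations. It is useful to define their ratio as

$$R(r) \equiv \frac{\hat b(r)}{\tilde b(r)}, \qquad (4)$$

which is a useful parameter to measure the non-linearity of the bias. In the linear case, $\hat b(r) = \tilde b(r)$ and $R(r) = 1$, whereas for
non-linear cases $R(r) < 1$. Note that $\hat b(r)$ is the bias as it would appear in the tracer-dark matter cross correlation while $\tilde b(r)$
would appear in the tracer auto-correlation.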

We next turn to stochasticity, by which we mean any physical or statistical process that produces a non-deterministic
relation between the dark matter and the galaxy (or halo) field. This may arise from the discrete nature of galaxies, in which
case it is called shot noise; if it is a Poisson process, its expression is inversely proportional to the mean density of objects,
$1/\bar n_g$, but the formalism used here allows for other stochastic processes, which are encoded in the width of $P(\delta_g|\delta)$.

In order to study the stochasticity of the bias, we define the random bias field $\epsilon(\mathbf{x})$ as the difference between the galaxy
field and the dark matter field once biased by the mean bias relation $b[\delta(\mathbf{x})]$,

$$\epsilon(\mathbf{x}) \equiv \delta_g(\mathbf{x}) - b[\delta(\mathbf{x})]\,\delta(\mathbf{x}). \qquad (5)$$

If $P(\delta_g|\delta)$ is a uni-variate Gaussian then $b(\delta)$ and $\sigma_b^2(\delta)$, the variance of $P(\delta_g|\delta)$ at a given $\delta$, completely specify $P(\delta_g|\delta)$.
The variance of the $\epsilon$ field is given by the average of $\sigma_b^2(\delta)$ over $\delta$.

In general, once $\delta$, $\delta_g$ and $\epsilon$ are defined, the corresponding correlation functions are

$$\xi_{mm}(r) \equiv \langle\delta(\mathbf{x})\,\delta(\mathbf{x}+\mathbf{r})\rangle, \qquad (6)$$

$$\xi_{mg}(r) \equiv \langle\delta_g(\mathbf{x})\,\delta(\mathbf{x}+\mathbf{r})\rangle, \qquad (7)$$

$$\xi_{gg}(r) \equiv \langle\delta_g(\mathbf{x})\,\delta_g(\mathbf{x}+\mathbf{r})\rangle, \qquad (8)$$

$$\xi_{\epsilon\epsilon}(r) \equiv \langle\epsilon(\mathbf{x})\,\epsilon(\mathbf{x}+\mathbf{r})\rangle, \qquad (9)$$

$$\xi_{\epsilon m}(r) \equiv \langle\epsilon(\mathbf{x})\,\delta(\mathbf{x}+\mathbf{r})\rangle. \qquad (10)$$

In what follows we are only interested in the two-point correlation function or the power spectrum, thus we do not need
to specify further moments of $P(\delta_g|\delta)$ or higher-order correlations.

Table 1. Different sets of parameters of the non-linear model given by Eq. 13 and the corresponding values of $\hat b$, $\tilde b$ and $R$.

Figure 1. Different biasing models. Left panel: set 1 (solid line), set 2 (dashed line) and set 3 (dotted line). Centre panel: set 4 (solid
line), set 5 (dashed line) and set 6 (dotted line). Right panel: set 5 with stochasticity component. Sets are detailed in Table 1.



In order to give an illustrative example of the bias formalism let us consider a simple non-linear bias model given by

$$b(\delta) = b_0 + b_1\delta + b_2\delta^2. \qquad (11)$$

Let us also assume that this bias model is non-stochastic. Therefore, according to Eq. 5, the galaxy overdensity
must be

$$\delta_g = b_0\delta + b_1\delta^2 + b_2\delta^3. \qquad (12)$$

In order to deal with the simplest scenario we assume that $\delta$ is a Gaussian random field. This means that the $n$-point correlation
function, $\langle\delta^n\rangle$, can be expressed as a function of the two-point correlation function, $\langle\delta^2\rangle$.

We have stated above that the $\delta_g$ field has to satisfy $\langle\delta_g\rangle = 0$. Provided that $\langle\delta\rangle = \langle\delta^3\rangle = 0$, we have that $b_1$ must be null:

$$b(\delta) = b_0 + b_2\delta^2. \qquad (13)$$

The biasing parameters given by Eqs. 2 and 3 are

$$\hat b = b_0 + 3b_2\langle\delta^2\rangle, \qquad (14)$$

$$\tilde b^2 = b_0^2 + 6b_0b_2\langle\delta^2\rangle + 15b_2^2\langle\delta^2\rangle^2, \qquad (15)$$

where we have used that $\langle\delta^4\rangle = 3\langle\delta^2\rangle^2$ and $\langle\delta^6\rangle = 15\langle\delta^2\rangle^3$ if $\delta$ is Gaussian.

As an illustrative example, we consider this simple biasing model with different sets of parameters, $b_0$ and $b_2$, listed in
Table 1. Set 1, set 2 and set 3 are linear biasing models ($b_2 = 0$), and therefore $\hat b = \tilde b$ and $R = 1$. These models are plotted
in Fig. 1 (left panel) with solid, dashed and dotted lines respectively. On the other hand, set 4, set 5 and set 6 are non-linear
biasing models ($b_2 \neq 0$) plotted in Fig. 1 (central panel) with solid, dashed and dotted lines respectively. In these cases it is
clear that $\hat b \neq \tilde b$ and therefore $R < 1$. Note also that $R$ is a good indicator of how non-linear the model is: the more different
$R$ is from 1, the more non-linear $b(\delta)$ is.

Finally, the stochasticity can easily be included in this formalism just by adding a random field, namely $\epsilon$, in Eq. 12. The
result of doing this with set 5 is shown in Fig. 1 (right panel).
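
The zero-lag moments of this quadratic model are simple enough to verify numerically. The sketch below evaluates $\hat b$, $\tilde b$ and $R$ from the Gaussian-moment expressions for $b(\delta) = b_0 + b_2\delta^2$ and compares them with a Monte Carlo estimate of the zero-lag averages. The function names and the parameter values ($b_0 = 1$, $b_2 = 0.5$, $\langle\delta^2\rangle = 0.25$) are hypothetical and are not the sets of Table 1.

```python
import numpy as np

def bias_moments(b0, b2, var):
    """Zero-lag b_hat, b_tilde and R for b(delta) = b0 + b2 * delta^2, Gaussian delta."""
    b_hat = b0 + 3.0 * b2 * var
    b_tilde = np.sqrt(b0**2 + 6.0 * b0 * b2 * var + 15.0 * b2**2 * var**2)
    return b_hat, b_tilde, b_hat / b_tilde

def bias_moments_mc(b0, b2, var, n=500_000, seed=0):
    """Same quantities estimated from Gaussian random draws of delta."""
    rng = np.random.default_rng(seed)
    delta = rng.normal(0.0, np.sqrt(var), n)
    b = b0 + b2 * delta**2
    b_hat = np.mean(delta**2 * b) / np.mean(delta**2)
    b_tilde = np.sqrt(np.mean(delta**2 * b**2) / np.mean(delta**2))
    return b_hat, b_tilde, b_hat / b_tilde

# Hypothetical parameter choice, for illustration only.
print("analytic   ", bias_moments(1.0, 0.5, 0.25))
print("Monte Carlo", bias_moments_mc(1.0, 0.5, 0.25))
print("linear case", bias_moments(2.0, 0.0, 0.25))   # R = 1 when b2 = 0
```

For the non-linear choice the analytic values are $\hat b = 1.375$ and $\tilde b^2 = 1.984375$, so $R$ falls slightly below unity, while the linear case returns $R = 1$.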



2.2 Redshift-space distortions of biased tracers 

So far we have defined the bias model in configuration space. However, to deal with redshift-space distortions, it is more
convenient to work with the galaxy density function $\delta_g$ in $k$-space. The basic relation for redshift distortions is, assuming the
distant observer approximation (Kaiser 1987),

$$\delta^s_g(\mathbf{k}) = \delta_g(\mathbf{k}) + f\mu^2(\mathbf{k})\,\delta(\mathbf{k}), \qquad (16)$$

where $\delta_g(\mathbf{k})$ is the Fourier transform of the overdensity of galaxies, $\delta^s_g(\mathbf{k})$ its redshift-space counterpart and $\delta(\mathbf{k})$ the real-space
transform of the overdensity of dark matter; $\mu(\mathbf{k})$ is the cosine of the angle between the line of sight and $\mathbf{k}$. Throughout this
section we consider a single snapshot at a fixed redshift, usually $z = 0$. Therefore, we write, for instance, $f$ instead of
$f(z)$ and so on. If we consider that we have two dark matter tracers, then:

$$\delta^s_{g_i}(\mathbf{k}) = \delta_{g_i}(\mathbf{k}) + f\mu^2(\mathbf{k})\,\delta(\mathbf{k}); \quad i = 1, 2, \qquad (17)$$

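with a corresponding covariance matrix

$$C \equiv \begin{pmatrix} \langle\delta^s_{g_1}(\mathbf{k})\,\delta^{s*}_{g_1}(\mathbf{k})\rangle & \langle\delta^s_{g_1}(\mathbf{k})\,\delta^{s*}_{g_2}(\mathbf{k})\rangle \\ \langle\delta^s_{g_2}(\mathbf{k})\,\delta^{s*}_{g_1}(\mathbf{k})\rangle & \langle\delta^s_{g_2}(\mathbf{k})\,\delta^{s*}_{g_2}(\mathbf{k})\rangle \end{pmatrix}. \qquad (18)$$

We define various power spectra as

$$\langle\delta_{g_i}(\mathbf{k})\,\delta^*_{g_j}(\mathbf{k})\rangle \equiv P_{g_ig_j}(k), \qquad (19)$$

$$\langle\delta(\mathbf{k})\,\delta^*_{g_i}(\mathbf{k})\rangle \equiv P_{mg_i}(k), \qquad (20)$$

$$\langle\delta(\mathbf{k})\,\delta^*(\mathbf{k})\rangle \equiv P_{mm}(k). \qquad (21)$$

Note that these quantities are related to Eqs. 6-8 through their Fourier transforms,

$$P(k) = \int d^3r\,\xi(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}. \qquad (22)$$

The terms of the covariance matrix of Eq. 18 can be expressed as

$$C_{ij}(k,\mu) = P_{mm}(k)\left[\frac{P_{g_ig_j}(k)}{P_{mm}(k)} + f\mu^2\,\frac{P_{mg_i}(k) + P_{mg_j}(k)}{P_{mm}(k)} + f^2\mu^4\right]. \qquad (23)$$
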

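In this paper, we will assume that the cross terms between the random field and the matter field are sub-dominant and can
be assumed to be zero:

$$\langle\epsilon(\mathbf{x})\,B[\delta(\mathbf{x}+\mathbf{r})]\rangle = 0, \qquad (24)$$

where $B[\delta]$ is any function of $\delta$. If we relate these quantities to the bias parameters defined in Eqs. 2 and 3 we obtain that,

$$P_{mm}(k) = \int d^3r\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}, \qquad (25)$$

$$P_{mg_i}(k) = \int d^3r\,\hat b_i(r)\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}, \qquad (26)$$

$$P_{g_ig_i}(k) = \int d^3r\,\tilde b_i^2(r)\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}} + \int d^3r\,\xi_{\epsilon_i\epsilon_i}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}, \qquad (27)$$

$$P_{g_1g_2}(k) = \int d^3r\,\langle\delta(\mathbf{x})\,b_1[\delta(\mathbf{x})]\,\delta(\mathbf{r}+\mathbf{x})\,b_2[\delta(\mathbf{r}+\mathbf{x})]\rangle\,e^{-i\mathbf{k}\cdot\mathbf{r}} + \int d^3r\,\xi_{\epsilon_1\epsilon_2}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}} \qquad (28)$$

$$\phantom{P_{g_1g_2}(k)} = \int d^3r\,R_{12}(r)\,\tilde b_1(r)\,\tilde b_2(r)\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}} + \int d^3r\,\xi_{\epsilon_1\epsilon_2}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}, \qquad (29)$$

where the parameter $R_{12}(r)$ is a new non-linearity parameter between tracers of type 1 and 2, which is defined as,

$$R_{12}(r) \equiv \frac{\langle b_1[\delta(\mathbf{x})]\,\delta(\mathbf{x})\,b_2[\delta(\mathbf{x}+\mathbf{r})]\,\delta(\mathbf{x}+\mathbf{r})\rangle}{\langle b_1[\delta(\mathbf{x})]\,\delta(\mathbf{x})\,b_1[\delta(\mathbf{x}+\mathbf{r})]\,\delta(\mathbf{x}+\mathbf{r})\rangle^{1/2}\,\langle b_2[\delta(\mathbf{x})]\,\delta(\mathbf{x})\,b_2[\delta(\mathbf{x}+\mathbf{r})]\,\delta(\mathbf{x}+\mathbf{r})\rangle^{1/2}}. \qquad (30)$$

For convenience we define new bias parameters,

$$\hat b_i(k) \equiv \frac{\int d^3r\,\hat b_i(r)\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}}{\int d^3r\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}} = \frac{P_{mg_i}(k)}{P_{mm}(k)}, \qquad (31)$$

$$\tilde b_i^2(k) \equiv \frac{\int d^3r\,\tilde b_i^2(r)\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}}{\int d^3r\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}} = \frac{P_{g_ig_i}(k) - P_{\epsilon_i\epsilon_i}(k)}{P_{mm}(k)}, \qquad (32)$$

$$R_i(k) \equiv \frac{\hat b_i(k)}{\tilde b_i(k)} = \frac{P_{mg_i}(k)}{\{P_{mm}(k)\,[P_{g_ig_i}(k) - P_{\epsilon_i\epsilon_i}(k)]\}^{1/2}}, \qquad (33)$$

$$R_{12}(k) \equiv \frac{\int d^3r\,R_{12}(r)\,\tilde b_1(r)\,\tilde b_2(r)\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}}{\left[\int d^3r\,\tilde b_1^2(r)\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}\right]^{1/2}\left[\int d^3r\,\tilde b_2^2(r)\,\xi_{mm}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}\right]^{1/2}} = \frac{P_{g_1g_2}(k) - P_{\epsilon_1\epsilon_2}(k)}{[P_{g_1g_1}(k) - P_{\epsilon_1\epsilon_1}(k)]^{1/2}\,[P_{g_2g_2}(k) - P_{\epsilon_2\epsilon_2}(k)]^{1/2}}, \qquad (34)$$

and the $\epsilon$ field power spectrum,

$$P_{\epsilon_i\epsilon_j}(k) \equiv \int d^3r\,\xi_{\epsilon_i\epsilon_j}(r)\,e^{-i\mathbf{k}\cdot\mathbf{r}}. \qquad (35)$$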



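^ In this expression the factor of 2 difference from that appearing in McDonald & Seljak (2009) is due to a different definition of $\delta_{\mathbf{k}}$: we
consider that $\delta_{\mathbf{k}}$ is complex.

Thus the covariance matrix reads,

$$C_{2tr}(k,\mu) = \begin{pmatrix} C_{11}(k,\mu) & C_{12}(k,\mu) \\ C_{12}(k,\mu) & C_{22}(k,\mu) \end{pmatrix}, \qquad (36)$$

with

$$C_{11}(k,\mu) = P_{mm}(k)\left[\tilde b_1^2(k) + 2f\mu^2\,\tilde b_1(k)R_1(k) + f^2\mu^4\right] + P_{\epsilon_1\epsilon_1}(k), \qquad (37)$$

$$C_{12}(k,\mu) = P_{mm}(k)\left\{R_{12}(k)\,\tilde b_1(k)\,\tilde b_2(k) + f\mu^2\left[\tilde b_1(k)R_1(k) + \tilde b_2(k)R_2(k)\right] + f^2\mu^4\right\} + P_{\epsilon_1\epsilon_2}(k), \qquad (38)$$

$$C_{22}(k,\mu) = P_{mm}(k)\left[\tilde b_2^2(k) + 2f\mu^2\,\tilde b_2(k)R_2(k) + f^2\mu^4\right] + P_{\epsilon_2\epsilon_2}(k). \qquad (39)$$

Because the $\tilde b_i(k)$ are parameters which cannot be obtained readily from observations, it is preferable to work with the redshift-space
distortion parameter $\beta_i(k) \equiv f/\tilde b_i(k)$. Also, $P_{mm}(k,z)$ at a given $z$ is not directly measurable. However, in the linear
regime we can write it as $P_{mm}(k,z) = D^2(z)\,P^0_{mm}(k)$, where $P^0_{mm}(k)$ is the fiducial power spectrum and $D(z)$ is the linear
growth factor. Thus, defining $x(z) \equiv D^2(z)f^2(z)$, the covariance matrix elements are:

$$C_{11}(k,\mu) = xP^0_{mm}(k)\left[\beta_1^{-2}(k) + 2\mu^2\beta_1^{-1}(k)R_1(k) + \mu^4\right] + P_{\epsilon_1\epsilon_1}(k), \qquad (40)$$

$$C_{12}(k,\mu) = xP^0_{mm}(k)\left\{R_{12}(k)\,\beta_1^{-1}(k)\,\beta_2^{-1}(k) + \mu^2\left[\beta_1^{-1}(k)R_1(k) + \beta_2^{-1}(k)R_2(k)\right] + \mu^4\right\} + P_{\epsilon_1\epsilon_2}(k), \qquad (41)$$

$$C_{22}(k,\mu) = xP^0_{mm}(k)\left[\beta_2^{-2}(k) + 2\mu^2\beta_2^{-1}(k)R_2(k) + \mu^4\right] + P_{\epsilon_2\epsilon_2}(k). \qquad (42)$$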

Note that the quantity $x$ encompasses all the relevant cosmological information about the growth of structure. From Eqs. 40-42
one can see that the covariance matrix can easily be separated in two parts: a signal part $S$ and a noise contribution $N$:

$$C_{2tr}(k,\mu) = S(k,\mu) + N(k), \qquad (43)$$

where the noise matrix is

$$N_{ij}(k) \equiv P_{\epsilon_i\epsilon_j}(k). \qquad (44)$$

If the two tracers are both a Poisson sample of the dark matter field and do not overlap then $N_{ij}(k)$ is diagonal, scale-independent
and its elements are $N_{11} = 1/\bar n_1$; $N_{22} = 1/\bar n_2$. This is the case considered by McDonald & Seljak (2009). Any
other source of stochasticity would add to the discreteness effect and, in general, may yield non-zero off-diagonal contributions
$N_{12} \neq 0$.

If we can estimate the noise part and $P^0_{mm}(k)$ is given e.g., by CMB observations, then the covariance matrix depends
on six parametric functions: $x$, $\beta_1(k)$, $\beta_2(k)$, $R_1(k)$, $R_2(k)$ and $R_{12}(k)$.
Considering one dark matter tracer the covariance matrix is simpler:

$$C_{1tr}(k,\mu) = xP^0_{mm}(k)\left[\beta^{-2}(k) + 2\mu^2\beta^{-1}(k)R(k) + \mu^4\right] + P_{\epsilon\epsilon}(k), \qquad (45)$$

and depends only on $x$, $\beta(k)$ and $R(k)$.

An interesting point is the 'hidden' relation between the different variables. For instance, given $\beta_1(k)$, $\beta_2(k)$ and the
relative number of objects of these two tracers, $\beta(k)$ is constrained. Also, in the two-tracer case, given $R_1(k)$ and $R_2(k)$,
$R_{12}(k)$ is also constrained. We give these relations in appendices A and B.
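
The covariance elements written in terms of $x$, $\beta_i$ and the $R$ parameters translate directly into code. The sketch below is one possible transcription of the two-tracer and one-tracer covariances at a single $(k,\mu)$; the function names, argument layout and numerical inputs are hypothetical. In the linear, noise-free limit ($R_1 = R_2 = R_{12} = 1$, $N = 0$) the signal matrix is rank one, which is the property the multi-tracer technique exploits.

```python
import numpy as np

def cov_two_tracers(x, p0, mu, beta, r=(1.0, 1.0), r12=1.0, noise=None):
    """2x2 covariance (signal plus noise) for two tracers at one (k, mu).

    x     : D^2 f^2
    p0    : fiducial matter power P^0_mm(k)
    beta  : (beta_1, beta_2)
    r     : (R_1, R_2);  r12 : R_12
    noise : 2x2 noise matrix N_ij (zero if None)
    """
    ib1, ib2 = 1.0 / beta[0], 1.0 / beta[1]
    s11 = ib1**2 + 2.0 * mu**2 * ib1 * r[0] + mu**4
    s22 = ib2**2 + 2.0 * mu**2 * ib2 * r[1] + mu**4
    s12 = r12 * ib1 * ib2 + mu**2 * (ib1 * r[0] + ib2 * r[1]) + mu**4
    signal = x * p0 * np.array([[s11, s12], [s12, s22]])
    return signal if noise is None else signal + np.asarray(noise)

def cov_one_tracer(x, p0, mu, beta, r=1.0, noise=0.0):
    """Scalar covariance for a single (combined) tracer."""
    return x * p0 * (beta**-2 + 2.0 * mu**2 * r / beta + mu**4) + noise

f = 0.483                     # growth rate at z = 0 quoted in the text
x = f**2                      # D = 1 at z = 0
c = cov_two_tracers(x, 1.0e4, 0.5, beta=(f / 1.0, f / 2.0))
print(np.linalg.det(c) / np.trace(c) ** 2)   # consistent with zero: rank-one signal
```

Adding a diagonal Poisson noise matrix, for example `noise=np.diag([1 / n1, 1 / n2])` with hypothetical number densities `n1` and `n2`, makes the matrix non-singular.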



2.3 Forecasting errors 



We use the Fisher matrix formalism (Fisher 1935) to estimate errors on $x$. When the means of the data are fixed (i.e. for a
given fiducial model), the Fisher matrix is given by (Tegmark et al. 1997)

$$F_{\lambda\lambda'} = \frac{1}{2}\,{\rm Tr}\left[C_{,\lambda}\,C^{-1}\,C_{,\lambda'}\,C^{-1}\right], \qquad (46)$$

where $C_{,\lambda} \equiv \partial C/\partial\lambda$, $C$ is the covariance matrix and $\lambda$ the parameters of the model. The marginalised variance of parameter
$\lambda$ is given by

$$\sigma_\lambda^2 = (F^{-1})_{\lambda\lambda}. \qquad (47)$$

Following e.g., Feldman et al. (1994), for a survey volume $V_u$, in the continuum approximation,

$$F_{\lambda\lambda'} = \frac{V_u}{(2\pi)^3}\int_{k_{\min}}^{k_{\max}} F_{\lambda\lambda'}(\mathbf{k})\,d^3k, \qquad (48)$$

which we evaluate by a discrete sum

$$F_{\lambda\lambda'} = \frac{V_u}{(2\pi)^2}\sum_{\mu=-1}^{+1}\;\sum_{k=k_{\min}}^{k_{\max}} \Delta\mu\,\Delta k\,F_{\lambda\lambda'}(k,\mu)\,k^2. \qquad (49)$$

Figure 2. Left panel: $\sigma_x/x$ vs. $S/N$. Right panel: $\sigma_x^{2tr}/\sigma_x^{1tr}$ vs. $S/N$. In both panels $b_1 = 1$ and (from bottom to top) $b_2 = 1/3$ (orange
line), 1/2 (green line), 1 (black line), 2 (red line) and 3 (blue line). For the combined sample, according to Eq. A8, the total bias factors
are: $b = 2/3$ (orange line), $b = 3/4$ (green line), $b = 1$ (black line), $b = 3/2$ (red line) and $b = 2$ (blue line). In the left panel the solid
lines represent the errors for the two-tracer case, the dashed lines for one tracer.

In this paper we assume that the fiducial power spectrum, the noise matrix and the $R$ parameters are known and we fix them
to their fiducial values. We will explore the dependence of the results on the assumed fiducial values in the following sections.
The parameters $\lambda$ are $\tilde b_1$, $\tilde b_2$ and $x$ or, equivalently, $\beta_1$, $\beta_2$ and $x$ for the two-tracer case; when reporting errors on $x$ we
marginalise over $\beta_1$ and $\beta_2$. For the one-tracer case the parameters are $\tilde b$ and $x$ and the reported errors on $x$ are marginalised
over $\tilde b$.

The specific values of $k_{\min}$ and $k_{\max}$ depend on the features of the survey, $k_{\min}$ being set by the survey volume and $k_{\max}$
usually by the onset of non-linearities. In our case $k_{\min}$ is set by the survey volume, and we conservatively set $k_{\max} = 0.1\,h\,{\rm Mpc}^{-1}$
for $z = 0$.
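
The forecasting procedure can be sketched end to end in a few lines. The example below is a simplified illustration, not the calculation used for the figures: it assumes linear bias (all $R = 1$), diagonal Poisson noise, a hypothetical analytic `toy_power` in place of the CAMB spectrum, a hypothetical $k_{\min} = 0.006\,h\,{\rm Mpc}^{-1}$, central finite differences for the derivatives of $C$, and a number-weighted mean bias for the combined sample (consistent with the combined biases quoted for Fig. 2, but an assumption about the form of Eq. A8).

```python
import numpy as np

F_GROWTH = 0.483              # f at z = 0, as quoted in the text
X_FID = F_GROWTH**2           # x = D^2 f^2 with D(z = 0) = 1

def toy_power(k):
    """Hypothetical smooth P^0_mm(k) in (Mpc/h)^3; a stand-in for a CAMB spectrum."""
    return 1.3e7 * k / (1.0 + (k / 0.02) ** 2) ** 1.5

def covariance(params, k, mu, nbar):
    """Covariance for linear bias (R = 1) and Poisson noise.
    params = (x, beta_1, ..., beta_n) for n = 1 or 2 tracers; nbar is a list."""
    x, inv_beta = params[0], 1.0 / np.asarray(params[1:])
    amp = inv_beta + mu**2
    return x * toy_power(k) * np.outer(amp, amp) + np.diag(1.0 / np.asarray(nbar))

def fisher(params, nbar, volume=1.0e9, kmin=0.006, kmax=0.1, nk=30, nmu=30):
    """Per-mode Fisher matrix (1/2) Tr[C,a C^-1 C,b C^-1] summed over (k, mu) bins."""
    params = np.asarray(params, dtype=float)
    k_edges = np.linspace(kmin, kmax, nk + 1)
    mu_edges = np.linspace(-1.0, 1.0, nmu + 1)
    ks = 0.5 * (k_edges[1:] + k_edges[:-1])
    mus = 0.5 * (mu_edges[1:] + mu_edges[:-1])
    dk, dmu = k_edges[1] - k_edges[0], mu_edges[1] - mu_edges[0]
    steps = 1.0e-5 * params * np.eye(params.size)   # one finite-difference step per row
    total = np.zeros((params.size, params.size))
    for k in ks:
        for mu in mus:
            cinv = np.linalg.inv(covariance(params, k, mu, nbar))
            # C^-1 dC/dlambda for each parameter, by central differences
            m = [cinv @ (covariance(params + h, k, mu, nbar)
                         - covariance(params - h, k, mu, nbar)) / (2.0 * h.sum())
                 for h in steps]
            for a in range(params.size):
                for b in range(params.size):
                    total[a, b] += 0.5 * np.trace(m[a] @ m[b]) * k**2 * dk * dmu
    return volume / (2.0 * np.pi) ** 2 * total

def sigma_x(params, nbar):
    """Marginalised error on x (first parameter)."""
    return np.sqrt(np.linalg.inv(fisher(params, nbar))[0, 0])

b1, b2, n_each = 1.0, 2.0, 5.0e-3      # hypothetical tracers with equal densities (Y = 1)
two = sigma_x([X_FID, F_GROWTH / b1, F_GROWTH / b2], [n_each, n_each])
# Combined sample: assumed number-weighted mean bias, (b1 + b2) / 2 for Y = 1.
one = sigma_x([X_FID, F_GROWTH / (0.5 * (b1 + b2))], [2.0 * n_each])
print(f"sigma_x/x: two tracers {two / X_FID:.3f}, one tracer {one / X_FID:.3f}")
```

With these assumptions the combined field is a linear combination of the two tracer fields, so the two-tracer error cannot exceed the one-tracer error; how much smaller it is depends on the number density and bias ratio chosen.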

In the next section we will compare the errors on $x$ obtained using the one- and the two-tracer approaches. To produce
the figures we assume we have a single snapshot at $z = 0$, which corresponds to $f = 0.483$ in a standard $\Lambda$CDM universe,
the power spectrum is given by CAMB (Lewis et al. 2000), the sampling volume is set to be $V_u = 1\,({\rm Gpc}/h)^3$ and all biases
and all $R$ coefficients are taken to be scale-independent. The relative number of tracers is $Y \equiv \bar n_1/\bar n_2$, and the signal-to-noise
ratio is $S/N \equiv b^2 P^0_{mm}(k = 0.1\,h/{\rm Mpc})\,\bar n$. Note that the signal-to-noise is defined relative to the combined sample of tracers. The
relation between $b$ and $b_1$, $b_2$ is given in Eq. A8. Since the fractional cosmic-variance error on the power spectrum (in a shell in
Fourier space) is constant with redshift, the quantities reported below are valid at different $z$ provided that the signal-to-noise
and the various bias parameters are defined at the redshift of interest. Note however that the fiducial value of $x$ (and that of
$f$) changes with redshift: $x$ increases with redshift up to $z \simeq 0.5$ and decreases for larger $z$, while $f$ increases with redshift, tending
to 1 asymptotically. We find that the dependence of the fractional error, $\sigma_x/x$, and of the ratio of errors, $\sigma_x^{2tr}/\sigma_x^{1tr}$, on the
value of $x$ is weak. More importantly, the $k_{\max}$ at which non-linearities become important is expected to depend on redshift
and to increase roughly as $(1+z)$. The number of independent modes $N$ in a given volume grows roughly like $k_{\max}^3$, and the
variance scales like $1/N$.
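
The scaling of the variance with $k_{\max}$ can be made concrete with a short mode count. The sketch below uses the continuum estimate of the number of Fourier modes inside a sphere of radius $k_{\max}$ for a periodic volume (for a real field only half of these are independent, which does not change the scaling) and assumes $k_{\max} = 0.1\,(1+z)\,h\,{\rm Mpc}^{-1}$ purely for illustration.

```python
import numpy as np

def n_modes(volume, kmax, kmin=0.0):
    """Fourier modes with kmin < |k| < kmax in a periodic volume (continuum limit)."""
    shell = 4.0 / 3.0 * np.pi * (kmax**3 - kmin**3)
    return volume * shell / (2.0 * np.pi) ** 3

volume = 1.0e9                  # (Mpc/h)^3, i.e. 1 (Gpc/h)^3
n0 = n_modes(volume, 0.1)
for z in (0.0, 0.5, 1.0):
    kmax = 0.1 * (1.0 + z)      # assumed scaling of the non-linear scale
    n = n_modes(volume, kmax)
    print(f"z = {z:.1f}  kmax = {kmax:.2f}  N = {n:.3e}  variance ratio = {n0 / n:.3f}")
```

Under this assumption the variance at $z = 1$ is one eighth of its $z = 0$ value for the same volume.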



2.4 Dependence on bias 

Given that the forecasted error on $x$ depends on many variables we will start by considering the effects of one variable at a
time. The first important effect to be analysed is how the bias of the tracers (absolute and relative) affects the measurements
of $x$ (and therefore of $f$). For simplicity, we assume linear biasing (i.e., all $R$ parameters equal 1) and the same number of
objects for two distinct populations ($Y = 1$) with diagonal Poisson-like noise.

In Fig. 2 we show $\sigma_x/x$ (left panel) and $\sigma_x^{2tr}/\sigma_x^{1tr}$ (right panel) vs. $S/N$. The two-tracer case indicates that the two
tracers are treated separately, yielding a covariance matrix as in Eqs. 36-42. The one-tracer case indicates that both tracers
are included in the same sample and the covariance matrix is given by Eq. 45. In both panels we have set $b_1 = 1$ (i.e. tracer
1 is unbiased) and $b_2$ ranges from 1/3 to 3: $b_2 = 1/3$ (orange line), 1/2 (green line), 1 (black line), 2 (red line), 3 (blue line).
The bias factors for the combined sample according to Eq. A8 are $b = 2/3$ (orange line), $b = 3/4$ (green line), $b = 1$ (black
line), $b = 3/2$ (red line) and $b = 2$ (blue line). In the left panel the solid lines correspond to the two-tracer case, the dashed
lines to the one-tracer case.

From Fig. 2 (left panel) we see that for the single-tracer case (dashed lines), the lower the combined bias $b$, the lower
the error $\sigma_x$. This can be understood if we recall that to measure $f$ we are using the redshift-space distortion effect. Since
this effect is proportional to $1/b$, the lower the bias, the larger the redshift-space distortions. As noted in McDonald & Seljak
(2009), this means that, slightly counter-intuitively, low-bias objects may act as useful tracers, even if they are used as a
single population. On the other hand, for the two-tracer case (solid lines) we can see that, i) the improvement (compared to
the one-tracer case) is significant only when the signal dominates ($\log[S/N] > 0.7$), and ii) the improvement increases as the
difference in the biases of the two populations increases, also as noted by McDonald & Seljak (2009). Note that McDonald &
Seljak (2009) measure the normalisation of $P_{\theta\theta}$ (with the shape given by external observables), which is a different quantity
from the $x$ considered here. However, the fractional forecasted errors in the two quantities are the same, $\sigma_x/x = \sigma_{P_{\theta\theta}}/P_{\theta\theta}$, as
is the relative two-tracer vs. one-tracer improvement, $\sigma_x^{2tr}/\sigma_x^{1tr} = \sigma_{P_{\theta\theta}}^{2tr}/\sigma_{P_{\theta\theta}}^{1tr}$.

Fig. 2 (right panel) further quantifies the effect. The improvement between the two cases depends on the ratio of biases
and is only significant if the $S/N$ is large enough. In particular, for a bias ratio of 3 (blue and orange lines), the improvement
is significant ($\sigma_x^{2tr}/\sigma_x^{1tr} \lesssim 0.5$) for $\log[S/N] \gtrsim 1.3$. However, for ratios of 2 (red and green lines), the improvement starts
to be significant only when $\log[S/N] \gtrsim 1.5$. Note also that for the special case $b_2 = b_1$ (black line), as expected, there
is no improvement. When comparing these results (especially Fig. 2, left panel) with those of McDonald & Seljak (2009)
one should keep in mind that they define the signal-to-noise at $k = 0.4\,h/{\rm Mpc}$ while we use $k = 0.1\,h/{\rm Mpc}$, and that
$P(k = 0.1\,h/{\rm Mpc}) \simeq 12\,P(k = 0.4\,h/{\rm Mpc})$. With this in mind, we reproduce their results.

We also conclude that for surveys with a low $S/N$ ($\log[S/N] < 0.7$), it is better to have a low-bias tracer; splitting the
sample and using two tracers will not yield significant improvement. For example, for a bias of $b = 0.75$ (green dashed line
in left panel of Fig. 2) we obtain fractional errors of $\sigma_x/x \simeq 0.08$ for $\log[S/N] \simeq 0.7$. On the other hand, for surveys with
higher $S/N$ ($\log[S/N] > 0.7$), it is better to use the two-tracer case, choosing two tracers with the highest possible bias ratio.
In this case, for biases of $b_1 = 1$ and $b_2 = 0.5$ (green solid line in left panel of Fig. 2), we reach $\sigma_x/x \simeq 0.04$ for $\log[S/N] = 1.5$.
In practice, of course, the choice of which tracers to use is complicated by their number density; a high bias may be desirable,
but one probably pays a penalty through low density and high shot noise.



2.5 Effect of bias non-linearities 

The second interesting issue is to consider how non-linearities in the bias (i.e. the $R$ parameters) affect $\sigma_x$. In this case we fix
$b_1 = 1$ and $b_2 = 2$ with the same number of objects for each tracer ($Y = 1$) and Poisson noise. In Figs. 3 and 4 (left panels) we
show how $\sigma_x^{1tr}/x$ (dashed lines) and $\sigma_x^{2tr}/x$ (solid lines) vary with $S/N$. In both cases the black line is for the perfect linear
bias case $R_1 = R_2 = R_{12} = 1$ and coloured lines show different non-linear cases (see the captions of Figs. 3 and 4 for details). In Figs. 3
and 4 (right panels) we show how the ratio $\sigma_x^{2tr}/\sigma_x^{1tr}$ varies with $S/N$.

In general, we see from Figs. 3 and 4 (left panels) that the two-tracer case is more sensitive to non-linear bias effects
than the one-tracer case for high $S/N$ ratios ($\log[S/N] \gtrsim 1.2$). In particular, in Fig. 3, for the single-tracer case a deviation
from unity of $R_1$ produces a slight reduction of the error which is the same over the whole $S/N$ range explored. This is due to a
reduction of the combined bias $b$: as we reduce $R_{12}$ (because we set $R_1 = R_{12}$), $b$ is reduced (see Eq. A7) and therefore the
total error is also reduced, as we have seen in section 2.4. On the other hand, for the two-tracer case there are two opposite
behaviours depending on the value of $S/N$: for $\log[S/N] > 1.2$, we observe that non-linearities produce an increase in the
error, whereas for $\log[S/N] < 1.2$ they produce a reduction. In the high $S/N$ regime reducing $R_1$ reduces the $\mu^2$ coefficient in
Eq. 40 and thus reduces the angular dependence. In the low $S/N$ regime this is compensated by the fact that the off-diagonal
terms of the covariance matrix are reduced by non-linearities (recall that the noise off-diagonal terms are set to zero here).
While this may not be clear at first sight from Eq. 41, we have verified it numerically.

From Fig. 3 (right panel) we see that the non-linear bias increases the improvement between the two approaches for
$\log[S/N] < 1.2$, and limits it for $\log[S/N] > 1.2$. In particular, non-linear bias with $R_1 = 0.9$ and $R_2 = 1$ gives $\sigma_x^{2tr}/\sigma_x^{1tr} \simeq 0.6$
for $\log[S/N] = 1.5$, whereas for the perfect linear bias case it is $\sigma_x^{2tr}/\sigma_x^{1tr} \simeq 0.5$. At lower signal-to-noise, namely $\log[S/N] = 0.7$,
we obtain for the same non-linear bias case $\sigma_x^{2tr}/\sigma_x^{1tr} \simeq 0.85$, and for the linear bias case $\sigma_x^{2tr}/\sigma_x^{1tr} \simeq 0.9$. Therefore,
non-linearities affect the two-tracer approach considerably more than the single-tracer approach. Non-linearities in the mean bias
relation slightly increase the precision of the $x$ measurement in the low signal-to-noise regime ($\log[S/N] < 1.2$), but they limit
the effectiveness of the two-tracer approach for high signal-to-noise ($\log[S/N] > 1.2$).

In Fig. 4 we observe a very similar behaviour. In this case we have set $R_1 = R_2 = 0.9$ and we
change the value of $R_{12}$. First of all we observe that the one-tracer case is not very sensitive to non-linearities in this range
of $R$ values. The small changes for one tracer are mainly due to the change of the combined bias, as we have noted above for Fig.
3. The second point is that small deviations from $R_{12} = 1.0$ produce an increase in the fractional error for the two-tracer
case for $\log[S/N] \gtrsim 1.2$, as can be seen in Fig. 4 (left panel). Also in Fig. 4 (right panel) we observe that the ratio of errors
increases quickly for $\log[S/N] \gtrsim 1.2$ as we leave $R_{12} = 1.00$.

In summary, it is important to note (Figs. 2, 3 and 4) that the two-tracer method yields a substantial improvement compared
to the one-tracer approach only in the high signal-to-noise regime and if the non-linearity parameters R are close to unity.
Even if R_1 is 0.9 with R_2 = 1, or R_1 = R_2 = 0.9 with R_12 = 0.95, the gain saturates at about a factor of two. In section
3.2 we will address the issue of whether dark matter haloes, as seen in N-body simulations, trace the underlying dark matter
with a bias that is linear enough for two tracers to be significantly advantageous compared to one.

Figure 3. Left panel: Fractional error σ_x/x vs. S/N. Dashed lines represent the single-tracer case and solid lines the
two-tracer case. Right panel: σ_x^2tr/σ_x^1tr vs. S/N. In both panels the colours show different non-linear cases: R_1 = 1.00
(black solid line), R_1 = 0.99 (red solid line), R_1 = 0.90 (blue solid line) and R_1 = 0.80 (green solid line); R_2 = 1 and
R_12 = ^^l-. For dashed lines, R is the corresponding value for the full sample given the above values for R_2 and R_12 as in
Eq. A4: R = 1.000 (black dashed line), R = 0.999 (red dashed line), R = 0.989 (blue dashed line) and R = 0.978 (green dashed
line).

Figure 4. Left panel: Fractional error σ_x/x vs. S/N. Dashed lines represent the single-tracer case and solid lines the
two-tracer case. Right panel: σ_x^2tr/σ_x^1tr vs. S/N. In both panels the colours show different non-linear cases: R_12 = 1.00
(red solid line), R_12 = 0.99 (blue solid line), R_12 = 0.95 (green solid line) and R_12 = 0.90 (orange solid line);
R_1 = R_2 = 0.9. The solid black line is the perfectly linear case. For dashed lines, R is the corresponding value for the
full sample given the above values for R_2 and R_12 as in Eq. A4: R = 1.000 (black dashed line), R = 0.900 (red dashed line),
R = 0.902 (blue dashed line), R = 0.910 (green dashed line) and R = 0.921 (orange dashed line).



2.6 Effect of off-diagonal noise terms 

So far we have assumed that the noise matrix in Eq. 43 is diagonal; but any source of stochasticity in addition to Poisson
sampling of two disjoint sets of objects will add extra contributions to the noise matrix, which are not necessarily diagonal.
Here we explore how σ_x/x and σ_x^2tr/σ_x^1tr change for a non-zero off-diagonal noise term. For simplicity we discuss the
linear bias case with b_1 = 1 and b_2 = 2 and the same number of objects for each tracer. Here, for direct comparison with
previous examples, we still set the diagonal elements of the noise matrix by the number density of the tracers (as if the
noise were Poisson), and since Y = 1, N_11 = N_22, but we allow N_12 ≠ 0. Note however that in any realistic application, a
process that adds non-zero off-diagonal noise terms will also increase the diagonal matrix elements. This should be kept in
mind during the following discussion.

In Fig. 5 we show how the error σ_x/x (left panel) and the ratio of errors σ_x^2tr/σ_x^1tr (right panel) change with S/N for
different values of N_12: from the reference case N_12 = 0 (black-solid line) to N_12 = N_11 (pink-solid line). Note that the
maximum value of N_12 is √(N_11 N_22), as can be deduced from the Cauchy-Schwarz inequality, |⟨ε_1 ε_2⟩|^2 ≤ ⟨ε_1^2⟩⟨ε_2^2⟩.
The black-dashed line represents the one-tracer case, which is not affected by changes in N_12.

We see a different behaviour in the high signal-to-noise regime (log[S/N] ≳ 0) and in the low signal-to-noise regime
(log[S/N] < 0). In the high S/N regime, the higher the off-diagonal noise, the lower the error σ_x. This can be understood if
we imagine the off-diagonal noise term as a correlation between the noise terms. The more correlated the noise of the two
tracers, the less the total noise; for high values of N_12, knowing N_11 means knowing N_22. In the low S/N regime we observe
the opposite behaviour: the higher N_12, the higher the error. We also observe that when the value of N_12 is very close to
N_11, σ_x decreases abruptly. In the low-signal case, adding an imperfect correlation between the noise terms just means
adding more noise, and only when this correlation is nearly perfect (N_12 ≃ N_11) does it improve the measurement.

Figure 5. Left panel: Fractional error σ_x/x vs. S/N. Right panel: σ_x^2tr/σ_x^1tr vs. S/N. For both panels N_12/N_11 = 0
(black line), N_12/N_11 = 0.4 (red line), N_12/N_11 = 0.8 (blue line), N_12/N_11 = 0.9 (green line) and N_12/N_11 = 1.0 (pink
line). The dashed line in the left panel corresponds to the one-tracer case, which is not affected by N_12.

Figure 6. Left panel: Fractional error σ_x/x vs. N_12/N_11. Right panel: σ_x^2tr/σ_x^1tr vs. N_12/N_11. For both panels
log S/N = -0.5 (black line), log S/N = 0.0 (red line), log S/N = 0.5 (blue line), log S/N = 1.0 (green line), log S/N = 1.5
(pink line) and log S/N = 2.0 (orange line).
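
The Cauchy-Schwarz bound on the off-diagonal noise term can be checked numerically before a noise matrix is used. The following is only a minimal sketch: the function names are illustrative (they do not come from the paper), and the 2x2 matrix is assumed to be symmetric with positive diagonal terms.

```python
import math


def max_offdiag_noise(n11: float, n22: float) -> float:
    """Cauchy-Schwarz upper bound on |N_12| given the diagonal noise terms."""
    return math.sqrt(n11 * n22)


def noise_correlation(n11: float, n22: float, n12: float) -> float:
    """Correlation coefficient between the two noise fields, N_12 / sqrt(N_11 N_22)."""
    return n12 / math.sqrt(n11 * n22)


def noise_matrix(n11: float, n22: float, n12: float):
    """Return the symmetric 2x2 noise matrix, rejecting an unphysical N_12."""
    # Small relative tolerance so that N_12 = sqrt(N_11 N_22) itself is accepted.
    if abs(n12) > max_offdiag_noise(n11, n22) * (1.0 + 1e-12):
        raise ValueError("N_12 violates the Cauchy-Schwarz bound")
    return [[n11, n12], [n12, n22]]
```

With Y = 1 (so N_11 = N_22) the bound reduces to N_12 ≤ N_11, which is the largest value explored in Figs. 5 and 6.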

In Fig. 6 we show the fractional error of x (left panel) and the ratio of errors between the two- and one-tracer cases (right
panel) vs. N_12/N_11 for different S/N regimes: from log(S/N) = -0.5 (black line) to log(S/N) = 2.0 (orange line). Here, the
same effect is observed. For high S/N (orange, pink and green lines), increasing N_12 decreases the error, whereas for low S/N
(black, red and blue lines) the error increases. Here, the effect on the error when N_12 → N_11 can be seen more clearly.

In the right panels of Figs. 5 and 6, for low signal-to-noise and for non-zero values of N_12, we have σ_x^2tr/σ_x^1tr > 1.
This effect is due to the fact that we are using Eq. A12 to relate the noise elements of the two-tracer case with the
single-tracer case. However, Eq. A12 only provides a correct relation among the diagonal noise matrix elements when the
off-diagonal noise terms are zero, which is no longer the case. In fact the noise for a single tracer built out of two tracers
for which the noise matrix is strongly non-diagonal is not strictly Poisson. Therefore it cannot be fully described by the
Poisson noise that it would have if the two tracers had a diagonal noise matrix (as Eq. A12 assumes). However, we have no
other way to model it, and we thus keep Eq. A12. This effect is important only in the low S/N regime and where N_12 is
comparable to the diagonal terms (N_12 ≲ N_11).



2.7 Dependence on the relative number density 

Finally, it is also interesting to see how the relative number of tracers can affect the error on x. We vary the ratio between
the number densities of the tracers, n_1 and n_2, namely Y = n_1/n_2, keeping the total number density of tracers, n, fixed.
Again, we assume linear bias, Poisson noise and biases b_1 = 1 and b_2 = 2.

Figure 7. Left panel: σ_x/x vs. S/N. Right panel: σ_x^2tr/σ_x^1tr vs. S/N. In the left panel the dashed lines are the errors
for the one-tracer model and the solid lines for the two-tracer model. For both panels Y = 1/20 (pink line), Y = 1/10 (green
line), Y = 1 (black line), Y = 10 (red line) and Y = 20 (blue line). b_1 = 1 and b_2 = 2 are assumed.



In Fig. 7 (left panel) we show the error on x vs. S/N for Y = 1/20 (pink line), Y = 1/10 (green line), Y = 1 (black line),
Y = 10 (red line) and Y = 20 (blue line), for the one-tracer model (dashed lines) and for the two-tracer one (solid lines). In
the right panel we show the ratio σ_x^2tr/σ_x^1tr vs. S/N for different values of Y using the same colour scheme.

Note that Y > 1 means that the highly-biased tracer has the lower number density; for Y < 1 the highly-biased tracer has the
higher number density. From Fig. 7 (left panel) we see that the error in the one-tracer model behaves as expected from Fig. 2:
an increase in Y causes a reduction in σ_x/x because it is equivalent to reducing the effective bias b (recall that
b_1 < b_2). On the other hand, for the two-tracer model we observe that the fractional error is lower if the low-bias tracer
is more abundant than the high-bias one. This is also expected because, as we have seen in Fig. 2, the lower the bias the
smaller the error on x. In Fig. 7 (right panel) we see that the maximal improvement for the two-tracer approach compared to
the one-tracer approach is realised when the number densities of the two tracers are equal, independently of the
signal-to-noise ratio. For unequal number densities, the two-tracer approach gives a better improvement over the one-tracer
approach if the number density of the highly biased tracer is lower than that of the tracer with lower bias.
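
The role of Y can be made concrete with a short sketch that splits a fixed total number density n into n_1 and n_2 and evaluates the effective bias of the merged sample. The number-weighted average used for the combined bias is an assumption made here for the linear-bias case, and the function names are illustrative only.

```python
def split_densities(n_total: float, y: float):
    """Split a fixed total number density into (n_1, n_2) with Y = n_1 / n_2."""
    n2 = n_total / (1.0 + y)
    n1 = n_total - n2
    return n1, n2


def combined_bias(n1: float, b1: float, n2: float, b2: float) -> float:
    """Number-weighted bias of the merged sample (assumed form for linear bias)."""
    return (n1 * b1 + n2 * b2) / (n1 + n2)


if __name__ == "__main__":
    # b_1 = 1 and b_2 = 2 as in the text; larger Y lowers the effective bias.
    for y in (1.0 / 20.0, 1.0 / 10.0, 1.0, 10.0, 20.0):
        n1, n2 = split_densities(1.0, y)
        print(f"Y = {y:6.3f}  ->  effective bias = {combined_bias(n1, 1.0, n2, 2.0):.3f}")
```

Under this assumption, increasing Y moves the effective bias from b_2 towards b_1, which is the trend invoked for the one-tracer curves.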



3 EXPECTED VALUES FOR PARAMETERS DESCRIBING BIAS AND STOCHASTICITY 

As we have seen, the improvements achievable by using two tracers depend on various features of the tracer population,
such as the signal-to-noise, the bias parameters and the amount of bias non-linearity. In the next two sections, we explore
which values are plausible and realistic if dark matter haloes are taken to be the tracers. We first use analytical arguments
and, in the following section, N-body simulations.



3.1 Extended Press Schechter approach 



In this section, we identify dark matter haloes with the peaks of an initially Gaussian field, and compute their number
densities and biases. We assume that the tracers (haloes) are linearly biased, as the non-linear corrections to halo bias
derived in this framework are very small. The volume effect that may arise in this formalism is due to the z-dependence of the
parameters, in particular the bias. A narrow-deep survey has a strong z-dependence in the bias and a wide-shallow one has a
very weak one. For this reason, in this section we assume a volume-limited cubic survey of comoving side 1 Gpc/h for all z,
and we perform the analysis at different values of z. This way it is easier to understand how the z-dependence of the bias
affects the errors of the one- and two-tracer models. We set k_min = 2π/V^{1/3} and k_max = 0.1 D(0)/D(z) h/Mpc. P^0(k) is
given by CAMB for a standard ΛCDM universe. We consider a range of redshift between 0 and 4, and we parametrise the redshift
dependence
of / as, f{z) — Q.m,{z)'^ , with 7 = 0.56. We choose the two tracers to be haloes of masses 10^^ Mq /h < M < lO^^Af©//! for 
tracer 1 and IO^A/q/Zi < M < IQ^^Mq/H for tracer 2. 
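
The survey set-up described above can be sketched numerically. This is only an illustration under stated assumptions: a flat ΛCDM background with Ω_m = 0.27 today (the value is borrowed from the simulation section and is not specified for this analytic calculation), the growth-index parametrisation f(z) = Ω_m(z)^γ with γ = 0.56, and a caller-supplied growth ratio D(z)/D(0). The function names are hypothetical.

```python
import math


def omega_m_z(z: float, omega_m0: float = 0.27) -> float:
    """Matter density parameter at redshift z for a flat LambdaCDM background."""
    a3 = (1.0 + z) ** 3
    return omega_m0 * a3 / (omega_m0 * a3 + 1.0 - omega_m0)


def growth_rate(z: float, omega_m0: float = 0.27, gamma: float = 0.56) -> float:
    """Growth rate f(z) = Omega_m(z)^gamma."""
    return omega_m_z(z, omega_m0) ** gamma


def k_min(box_side: float) -> float:
    """Fundamental mode 2*pi / V^(1/3) of a cubic survey (box_side in Mpc/h)."""
    return 2.0 * math.pi / box_side


def k_max(growth_ratio: float) -> float:
    """Linear-regime cut 0.1 * D(0)/D(z) in h/Mpc; growth_ratio is D(z)/D(0)."""
    return 0.1 / growth_ratio


if __name__ == "__main__":
    print(f"k_min = {k_min(1000.0):.5f} h/Mpc for a 1 Gpc/h box")
    for z in (0.0, 1.0, 2.0, 4.0):
        print(f"z = {z:.1f}: Omega_m(z) = {omega_m_z(z):.3f}, f(z) = {growth_rate(z):.3f}")
```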

The number density of these haloes is related to the halo mass function, given by

    n(M,z) = -\frac{2\rho_m}{M\,\sigma(M)}\,\frac{d\sigma(M)}{dM}\,\nu f(\nu)        (50)

where σ(M) is the rms of the density fluctuations, computed from the power spectrum linearly extrapolated to z = 0 and filtered
with a top-hat sphere of mass M, ρ_m is the mean density of the Universe (we set it at ρ_m = 7 × 10^10 (M☉/h)/(Mpc/h)^3) and
ν = δ_sc^2(z)/σ^2(M). Here, δ_sc(z) denotes the critical threshold for collapse and is given by δ_sc(z) = 1.686/D(z). We use
the Sheth & Tormen (1999) mass function:

    \nu f(\nu) = A(p)\left[1 + (q\nu)^{-p}\right]\left(\frac{q\nu}{2\pi}\right)^{1/2}\exp(-q\nu/2)        (51)


Figure 8. Left panel: the bias as a function of redshift for tracer 1 {10^'^ M^/h < M < IO^^Mq/H solid line), tracer 2 (lO^^ Mq/Zi < 
M < IO^^Mq/Zi, dotted line) and for the whole sample (black-dashed line). Right panel: S/N for the full sample (see text for definition) 
as a function of redshift. 
Figure 9. Left panel: σ_x/x vs. z for the one-tracer case (dashed line) and for the two-tracer case (solid line). Right panel: σ_x^2tr/σ_x^1tr vs. z.



where p = 0.3, q = 0.75 and the normalisation factor is A(p) = [1 + 2^{-p} Γ(1/2 - p)/Γ(1/2)]^{-1}. Thus, the number density
of objects is

    n(M_1, M_2, z) = \int_{M_1}^{M_2} n(M,z)\,dM        (52)

Assuming Poisson noise, we can directly relate this to the noise matrix elements,

    N_{ii}(z) = 1/n(M_{1i}, M_{2i}, z) .        (53)

As long as we are considering Poisson noise, the off-diagonal terms of the noise matrix are 0.

For the bias dependence we assume a linear bias (\hat{b} = \tilde{b} = b). The bias of an object of mass M at redshift z is
given by (Kaiser 1984; Mo et al. 1997; Scoccimarro et al. 2001)

    b(M,z) = 1 + \frac{1}{D(z)}\left[\frac{\delta_{sc}(z)}{\sigma^2(M)} - \frac{1}{\delta_{sc}(z)}\right]        (54)

The bias of a set of objects with masses between M_1 and M_2 is given by

    b(M_1, M_2, z) = \frac{\int_{M_1}^{M_2} dM\, n(M,z)\, b(M,z)}{n(M_1, M_2, z)}        (55)
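
As a sanity check on the ingredients of Eqs. 51 and 54, the sketch below evaluates the normalisation A(p), the multiplicity function ν f(ν) and the halo bias. It assumes the standard Sheth-Tormen form ν f(ν) = A(p)[1 + (qν)^{-p}](qν/2π)^{1/2} exp(-qν/2) with ν = δ_sc^2/σ^2, and the bias b = 1 + [δ_sc/σ^2 - 1/δ_sc]/D with δ_sc = 1.686/D; σ(M) itself is not computed and must be supplied by the caller. The function names are illustrative.

```python
import math

DELTA_C0 = 1.686  # critical collapse threshold at z = 0


def st_normalisation(p: float = 0.3) -> float:
    """A(p) = [1 + 2^-p Gamma(1/2 - p) / Gamma(1/2)]^-1."""
    return 1.0 / (1.0 + 2.0 ** (-p) * math.gamma(0.5 - p) / math.gamma(0.5))


def st_nu_f_nu(nu: float, p: float = 0.3, q: float = 0.75) -> float:
    """Sheth-Tormen multiplicity nu*f(nu), with nu = delta_sc^2 / sigma^2."""
    qnu = q * nu
    return (st_normalisation(p) * (1.0 + qnu ** (-p))
            * math.sqrt(qnu / (2.0 * math.pi)) * math.exp(-0.5 * qnu))


def delta_sc(growth: float) -> float:
    """Collapse threshold delta_sc(z) = 1.686 / D(z), with D normalised to 1 at z = 0."""
    return DELTA_C0 / growth


def halo_bias(sigma_m: float, growth: float = 1.0) -> float:
    """Halo bias for rms fluctuation sigma(M) at z = 0 and growth factor D(z)."""
    d = delta_sc(growth)
    return 1.0 + (d / sigma_m ** 2 - 1.0 / d) / growth


if __name__ == "__main__":
    print(f"A(p = 0.3) = {st_normalisation():.4f}")
    for growth in (1.0, 0.6, 0.4):
        print(f"D = {growth:.1f}: b(sigma = 1) = {halo_bias(1.0, growth):.2f}")
```

For fixed σ(M) the bias grows quickly as D(z) decreases, which is the redshift trend discussed for Fig. 8.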

In Fig.[8](left panel) we show how the bias of tracer 1 {10^^ Mq/H < M < IO^Mq/Zi red-solid hne), tracer 2 (IO^A/q/Zi < 
M < 10^* Mq / h blue-solid line) and the combined sample (black-dashed line) change as a function of redshift. Bias increases 
with redshift roughly as 1/D^2(z). As there are many more lower-mass haloes, the overall bias is dominated by this population.
In Fig. 8 (right panel) we show the signal-to-noise ratio as a function of redshift. As before, we define
S/N(z) = b^2(z) P^0(k = 0.1) D^2(z) [n_1(z) + n_2(z)], where b(z) corresponds to the bias of the whole sample. Note that
b^2(z) P^0(k = 0.1) D^2(z) goes roughly as 1/D(z)^2, but the number density of objects (the mass function) drops exponentially:
the S/N decreases with increasing redshift; the maximum value of S/N is at z = 0, where S/N ≃ 15.
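
The signal-to-noise definition used here is simple enough to write down directly. In the sketch below the function name and the numerical value of P^0(k = 0.1) are placeholders chosen for illustration, not values taken from the paper.

```python
def signal_to_noise(bias: float, p0_k01: float, growth: float,
                    n1: float, n2: float) -> float:
    """S/N = b^2 P^0(k = 0.1) D^2 (n_1 + n_2) for the whole sample."""
    return bias ** 2 * p0_k01 * growth ** 2 * (n1 + n2)


if __name__ == "__main__":
    # Placeholder power-spectrum amplitude in (Mpc/h)^3; densities in (h/Mpc)^3.
    print(signal_to_noise(bias=1.05, p0_k01=1000.0, growth=1.0, n1=2.1e-3, n2=4.0e-4))
```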

In Fig. 9 we show the errors σ_x/x corresponding to the one- and two-tracer cases (left panel) and their ratio (right panel)
as a function of redshift. In Fig. 9 (left panel) we see that there is a minimum in the value of σ_x/x at z ≃ 1.5, both for
the one- and for the two-tracer cases. This seems paradoxical (at least for the one-tracer model) because we have seen in
section 2.4 that as we increase the bias, the error on x increases. Furthermore, shot noise increases with redshift. The
explanation of this seemingly paradoxical effect is that not only the noise and the bias change with z, but also k_max(z)
and x(z). Recall that x(z) is our signal and that the fractional error per interval of k goes like the inverse square root of
the number of modes, which scales as k_max^3 - k_min^3. It turns out that, as we increase the available range of k in Eq. 49,
the error σ_x/x is reduced, scaling as these considerations indicate.
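
The mode-counting argument can be illustrated with a few lines of code. The normalisation V/(6π^2), which counts all Fourier modes in the shell without halving for the reality condition, is an assumption made here; only the scaling with k_max^3 - k_min^3 matters for the argument, and the function names are hypothetical.

```python
import math


def n_modes(volume: float, k_min: float, k_max: float) -> float:
    """Number of Fourier modes with k_min < |k| < k_max in a survey of given volume."""
    return volume * (k_max ** 3 - k_min ** 3) / (6.0 * math.pi ** 2)


def error_scaling(volume: float, k_min: float, k_max: float) -> float:
    """Relative statistical error from mode counting alone, proportional to N^(-1/2)."""
    return 1.0 / math.sqrt(n_modes(volume, k_min, k_max))


if __name__ == "__main__":
    volume = 1.0e9                   # (Mpc/h)^3, i.e. a 1 Gpc/h cube
    kmin = 2.0 * math.pi / 1000.0    # h/Mpc
    for kmax in (0.1, 0.15, 0.2):
        print(f"k_max = {kmax:.2f}: N = {n_modes(volume, kmin, kmax):.0f}, "
              f"relative error ~ {error_scaling(volume, kmin, kmax):.4f}")
```

Because k_max grows as D(0)/D(z), the number of usable modes rises with redshift, which is the effect that competes with the growing bias and shot noise.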

The last effect dominates at low z. At higher z, the noise and bias effects dominate, and therefore σ_x/x increases with z, as
we have seen in section 2.4. In Fig. 9 (right panel) we see that only at low redshifts (z < 1), where the signal-to-noise is
high, the improvement between the two cases is significant, reaching the minimum value of σ_x^2tr/σ_x^1tr ≃ 0.88 at z = 0.
This result may seem worse than what we have shown in the last section for S/N ≃ 15. However, taking into account that now the
ratio of biases is not 2 but ≃ 1.5, and also recalling that now Y ≠ 1, the ratio of errors increases from ≃ 0.7 to ≃ 0.88.

From Fig. 9 it is clear that if the survey is dominated by low-redshift objects (wide-shallow surveys), then splitting
the sample may reduce the error on x. However, the highest precision in measuring x is reached with surveys whose volume
configuration results in most objects having z ≃ 1.5.

A mass selection of haloes as used here may provide a model applicable to SZ-selected clusters or, in an idealised way, to
LRG galaxies. But other types of survey may select tracers in a radically different way; for example, emission-line-selected
blue galaxies have a bias that evolves with redshift much more slowly than considered above. We therefore also consider a
complementary, yet still highly idealised, way to select haloes and explore whether in that case the improvement from
splitting the sample can be much larger. We select tracers by their peak height, i.e. their ν of Eq. 50, keeping the maximum
and minimum ν (rather than the mass) of each tracer sample constant in redshift. We have explored different cuts and found the
following. (i) The improvement from splitting the sample is maximised when the two samples have comparable numbers of objects.
When, to maximise the bias difference between the two samples, the highly-biased tracer includes objects that are very rare,
the shot noise for that sample becomes important and the gain from splitting the sample decreases. (ii) By suitably choosing
the ν cuts, we have sampled the parameter space and managed to achieve σ_x^2tr/σ_x^1tr ≃ 0.6, but we have not been able to
improve the gain further. For instance, choosing as tracer 1 structures with ν between 0.9 and 1.5 and as tracer 2 structures
with ν between 1.5 and 20, we reach a gain of 0.6 at z ≃ 2.
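
To put these error ratios in perspective, they can be converted into the survey volume a single merged sample would need to reach the same precision. The sketch assumes that the sample-variance-limited error scales as V^(-1/2); the function name is illustrative.

```python
def equivalent_volume_factor(error_ratio: float) -> float:
    """Volume increase giving the same error reduction, assuming sigma ~ V^(-1/2)."""
    return 1.0 / error_ratio ** 2


if __name__ == "__main__":
    for ratio in (0.9, 0.88, 0.6):
        print(f"sigma_2tr/sigma_1tr = {ratio:.2f}  ->  "
              f"volume factor = {equivalent_volume_factor(ratio):.2f}")
```

Under this scaling a ratio of 0.6 corresponds to a volume roughly three times larger, in line with the statement in the abstract.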

Before any practical application of these findings one should bear in mind that we have assumed a volume-limited sample
(i.e. that all haloes in the required range are detected). In addition, we have considered a fixed survey volume seen at
different redshifts: for a given sky coverage the volume per unit redshift increases with redshift for z < 2.5. Finally, we
have selected haloes in a very idealised way; in practice, the selection will likely be applied to galaxies, whose halo
occupation distribution is not straightforward. We will discuss this further in section 4.



3.2 Simulations 

As we have seen in Figs. 3 and 4, the gain from splitting the sample depends on the non-linearity parameters, especially at
high S/N. In this case, the R parameters have to be very close to unity to exhibit a substantial gain. In addition, we have so
far relied on the Poisson sampling assumption, which may not hold in detail. For tracers that can be identified with dark
matter haloes, these issues can be addressed by looking at N-body simulations.

We choose a flat ΛCDM cosmology with cosmological parameters consistent with current observational data. More
specifically, the cosmological parameters of the simulation are Ω_m = 0.27, Ω_Λ = 0.73, h = 0.7, Ω_b h^2 = 0.023, n_s = 0.95
and σ_8 = 0.8. Our cosmological simulation consists of 1024^3 particles in a volume of (1 Gpc/h)^3. This results in a particle
mass of about 7 × 10^10 M☉/h. The initial conditions of the simulation were generated at redshift z = 65.67 by displacing the
particles according to the Zel'dovich approximation from their initial grid points. The initial power spectrum of the density
fluctuations was computed by CAMB (Lewis et al. 2000).

Taking only the gravitational interaction into account, the simulation was performed with GADGET-2 (Springel 2005)
using a comoving softening length of 30 kpc/h and a PM grid size of 2048^3. The chosen mass resolution and force resolution
enable us to resolve haloes with masses above ≃ 10^12 M☉/h, i.e. each halo contains at least 15 particles. We identify haloes
at redshift z = 0 by the Friends-of-Friends algorithm with a linking length of 0.2 times the mean interparticle separation. We
split the haloes in two mass bins: 10^12 M☉/h < M < 10^13 M☉/h (M12) and M > 10^13 M☉/h (M13). The mass bins M12 and
M13 consist of about 2.1 and 0.4 million haloes, respectively.
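
The quoted particle mass, mass resolution and halo number densities follow from simple arithmetic, sketched below. The critical density value 2.775 × 10^11 h^2 M☉/Mpc^3 is a standard constant introduced here, and the function names are illustrative.

```python
RHO_CRIT = 2.775e11  # critical density today in h^2 Msun / Mpc^3


def particle_mass(omega_m: float, box_side: float, n_side: int) -> float:
    """Particle mass in Msun/h for n_side^3 particles in a cube of box_side Mpc/h."""
    return omega_m * RHO_CRIT * box_side ** 3 / n_side ** 3


def number_density(n_haloes: float, box_side: float) -> float:
    """Comoving number density in (h/Mpc)^3."""
    return n_haloes / box_side ** 3


if __name__ == "__main__":
    m_p = particle_mass(0.27, 1000.0, 1024)
    print(f"particle mass      = {m_p:.2e} Msun/h")
    print(f"15-particle haloes = {15 * m_p:.2e} Msun/h")
    print(f"n_1 (M12) = {number_density(2.1e6, 1000.0):.1e} (h/Mpc)^3")
    print(f"n_2 (M13) = {number_density(0.4e6, 1000.0):.1e} (h/Mpc)^3")
```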

In order to derive the mean conditional bias b(δ) (see Eq. 1) for the two different tracers, we first compute the halo
overdensity, δ_h, and the matter overdensity, δ, by assigning the haloes and particles, respectively, to a 512^3 grid using the cloud
in cell (CIC) scheme. The overdensities are then further smoothed by a Gaussian filter, where we choose the
smoothing length to be l_s = 2.5, 5, 10 and 20 Mpc/h. One expects that any stochasticity, non-linear or non-local effects on
the bias relation should decrease as the field is smoothed with increasing smoothing lengths. The biased density as a function
of the matter overdensity, b(δ)δ, is then computed by averaging δ_h in the corresponding δ bin (see Eq. 1). Using the mean
bias relation so obtained, we can compute the noise field ε on the grid by applying Eq. 5. After Fourier transforming the




Figure 10. Bias parameters as a function of tiie scale k obtained from simulations, b are the dashed lines and b the solid lines. The black 
lines correspond to the whole sample of haloes, whereas red lines corresponds to sample M12 (lO^^AfQ < M < IO^^Mq) and blue lines 
to sample M13 (M > The smoothing length is 2.5 for the top-left panel, 5.0 for the top-right panel, 10.0 for the bottom-left 

panel and 20.0 Mpc//i for the bottom-right panel. 



different fields using the same 512^3 grid, we can compute the power spectra and cross power spectra of the different quantities
by spherically averaging the product of their Fourier modes, e.g. P_εε(k) = ⟨ε(k)ε*(k)⟩ and P_hm(k) = ⟨δ_h(k)δ*(k)⟩.
Note that in what follows, in evaluating the bias and R parameters from the simulations we have used explicitly Eqs. |3H35[ 
the noise terms are not assumed to be Poissonian but are computed directly from the e field. 
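
The binning procedure described above can be sketched on a flattened list of grid cells. This is an illustration only: it assumes that the noise field is the residual ε = δ_h - ⟨δ_h|δ⟩ of each cell with respect to the binned mean relation, and the bin edges and function names are hypothetical.

```python
from bisect import bisect_right


def bin_index(x, edges):
    """Index of the bin [edges[i], edges[i+1]) containing x, or None if outside."""
    i = bisect_right(edges, x) - 1
    return i if 0 <= i < len(edges) - 1 else None


def mean_conditional_relation(delta, delta_h, edges):
    """Mean halo overdensity <delta_h | delta> in each matter-overdensity bin."""
    nbins = len(edges) - 1
    sums = [0.0] * nbins
    counts = [0] * nbins
    for d, dh in zip(delta, delta_h):
        i = bin_index(d, edges)
        if i is not None:
            sums[i] += dh
            counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def noise_field(delta, delta_h, edges):
    """Residual of each cell with respect to the mean relation (None if unbinned)."""
    means = mean_conditional_relation(delta, delta_h, edges)
    eps = []
    for d, dh in zip(delta, delta_h):
        i = bin_index(d, edges)
        eps.append(None if i is None else dh - means[i])
    return eps


if __name__ == "__main__":
    delta = [-0.5, -0.4, 0.1, 0.2, 0.6, 0.7]
    delta_h = [-1.0, -0.8, 0.1, 0.3, 1.3, 1.5]
    edges = [-1.0, 0.0, 0.5, 1.0]
    print(mean_conditional_relation(delta, delta_h, edges))
    print(noise_field(delta, delta_h, edges))
```

By construction the residuals average to zero within each bin, so the power spectrum of ε measures only the scatter around the mean bias relation.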



In Fig. 10 we show the bias parameters 6 (dashed lines) and 6 (solid lines) vs. k, for the whole sample (black lines), for 
M12 (red lines) and for M13 (blue lines), for different smoothing lengths. 

The vertical dotted line marks k = 0.1 h/Mpc, which is the typical scale at z = 0 where non-linearities appear. In this
paper, we always work in the linear regime, where k < 0.1D{0)/ D{z) h/Mpc. From these plots we can see that b is robust 
to changes in Is and also is approximately scale independent, b is also very close to be scale invariant and varies little with 
the smoothing scale. In order to have a numerical reference, b changes by about 7% and b by about 5% compared with their 
values for Is = 2.5Mpc//i, over a range k,nin ~ 0.01 fe/Mpc to kmax ~ 0.1 h/Mpc. For Is = 20Mpc//i, both parameters change 
by about 7% in the same fc-range. On the other hand, for k ~ 0.1/i/Mpc, b changes by about 3% as Is goes from 2.5 to 20.0 
Mpc/h; b changes less than 1% in the same range. 



In Fig. 11 we show the values of the non-linearity parameters vs. k obtained from simulations: R(k) (black line), R_1(k)
(red line), R_2(k) (blue line) and R_12(k) (green line). Different panels correspond to different smoothing scales l_s, as in
Fig. 10. All non-linearity parameters decrease as the scale decreases, but this trend disappears as we increase the smoothing
length. When the smoothing length is large enough (≃ 20 Mpc/h), all R parameters are approximately scale-invariant and very
close to 1.



In Fig. 12 (left panel) we show the different noise components obtained from simulations vs. k, for different smoothing
lengths: 2.5 (red lines), 5.0 (blue lines), 10.0 (green lines) and 20.0 Mpc/h (orange lines). The solid lines are
⟨ε_1(k)ε_1*(k)⟩ (M12), the dashed lines ⟨ε_2(k)ε_2*(k)⟩ (M13) and the dotted lines the cross terms between the two samples,
⟨ε_1(k)ε_2*(k)⟩. The noise terms for the whole sample are not shown for clarity. In order to get a better comparison between
all these noise terms, we have removed the effects of the smoothing by dividing each noise term by the filter squared.

Figure 11. Non-linearity parameters R(k) (black line), R_1(k) (red line), R_2(k) (blue line) and R_12(k) (green line), obtained
from simulations for different smoothing lengths, as in Fig. 10. The subscripts 1 and 2 refer to the M12 and M13 samples
respectively, whereas no subscript refers to the whole sample. The vertical dotted line, k = 0.1 h/Mpc, marks the limit of the
linear regime. The horizontal dashed line marks the maximum value for all R parameters.

The black lines are the noise predictions assuming a Poisson-like noise (⟨ε_i(k)ε_i*(k)⟩ = 1/n_i). The solid line is for the
M12 sample and the dashed line for M13. The cross term is relatively small (black dotted line). We observe that the M13
noise is sub-Poisson whereas that for M12 is super-Poisson. This is in agreement with the findings of Seljak et al. (2009).
It has been noted before (Smith et al. 2007) that for massive haloes the noise could be sub-Poisson. At scales smaller than
the ones of interest here, Smith et al. (2007) ascribe this to halo-exclusion effects. Noise above the Poisson level is
expected if other sources of stochasticity affect halo formation. The formation and evolution of dark matter haloes is a
highly complicated process: dark matter haloes grow through a mixture of smooth accretion, violent encounters and
fragmentation. In the classical extended Press-Schechter/excursion set theory, haloes are identified with initial density
peaks and the computation of the halo mass function (and thus, as a derived quantity, the halo bias) is mapped into a first
passage process in the presence of a sharp barrier. This yields a deterministic halo bias, but cannot capture the full
physical complications inherent to a realistic description of halo formation. In addition, numerical simulations show that
there is not a good correspondence between peaks in the initial density field and collapsed haloes (see Katz et al. 1993;
Seljak & Warren 2004). Recently Maggiore & Riotto (2009) proposed to include these effects, at least at an effective level, by
taking into account that the critical value for collapse is not a fixed constant but itself a stochastic variable. This will
naturally lead to an extra "noise" component in the halo bias.

In Fig. 12 (right panel) we show the signal component, P_mm(k), obtained from simulations for different smoothing
lengths. The colour notation is the same as in the left panel. In this case, the black line is the linear theory prediction
for the same cosmological parameters used in our simulations. The effect of sample variance can be clearly seen at large
scales. The enhancement of the clustering at small scales due to non-linear gravitational evolution is hidden by the smoothing
of the density field.

Figure 12. Noise terms (left panel) and power spectrum (right panel) as a function of scale obtained from simulations. In the
left panel, ⟨ε_1(k)ε_1*(k)⟩ (M12) solid lines, ⟨ε_2(k)ε_2*(k)⟩ (M13) dashed lines and ⟨ε_1(k)ε_2*(k)⟩ dotted lines. The black
lines are the prediction for a Poisson-like noise. In the right panel, P_mm(k) from CAMB (black line) and P_mm(k) from
simulations (coloured lines). In both panels, the colours refer to the smoothing length: 2.5 (red), 5.0 (blue), 10.0 (green)
and 20.0 Mpc/h (orange).



l_s (Mpc/h)   2.5                   5.0                   10.0                  20.0

b             1.05                  1.05                  1.05                  1.05
b_1           0.97                  0.97                  0.97                  0.97
b_2           1.47                  1.47                  1.47                  1.47
R(k)          1.0 - 0.39208k        0.99737 - 0.26854k    0.99666 - 0.1585k     0.99587 - 0.01972k
R_1(k)        0.99939 - 0.4033k     0.99591 - 0.31089k    0.99567 - 0.18733k    0.99479 - 0.02141k
R_2(k)        1.00752 - 0.29446k    1.00145 - 0.22144k    0.99941 - 0.11259k    0.99876 - 0.02276k
R_12(k)       0.99922 - 0.16926k    0.99883 - 0.12419k    0.99922 - 0.06535k    0.99905 - 0.01082k
α             1.250                 1.325                 1.675                 2.10
α_1           1.365                 1.428                 1.785                 2.100
α_2           0.680                 0.680                 0.760                 0.840

Table 2. Parameter values used in the plots of Fig. 13 as a function of the smoothing scale used.
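
The linear fits of Table 2 can be evaluated directly, for instance at the edge of the linear regime. The sketch below stores each fit as an (intercept, slope) pair and assumes that k is expressed in h/Mpc, as in the rest of the section; the dictionary layout and function name are illustrative.

```python
# Linear fits R(k) = intercept + slope * k from Table 2, keyed by smoothing
# length l_s in Mpc/h. k is assumed to be in h/Mpc.
R_FITS = {
    2.5:  {"R": (1.0, -0.39208),     "R1": (0.99939, -0.4033),
           "R2": (1.00752, -0.29446), "R12": (0.99922, -0.16926)},
    5.0:  {"R": (0.99737, -0.26854), "R1": (0.99591, -0.31089),
           "R2": (1.00145, -0.22144), "R12": (0.99883, -0.12419)},
    10.0: {"R": (0.99666, -0.1585),  "R1": (0.99567, -0.18733),
           "R2": (0.99941, -0.11259), "R12": (0.99922, -0.06535)},
    20.0: {"R": (0.99587, -0.01972), "R1": (0.99479, -0.02141),
           "R2": (0.99876, -0.02276), "R12": (0.99905, -0.01082)},
}


def r_fit(name: str, l_s: float, k: float) -> float:
    """Evaluate the fitted non-linearity parameter at wavenumber k."""
    intercept, slope = R_FITS[l_s][name]
    return intercept + slope * k


if __name__ == "__main__":
    for l_s in sorted(R_FITS):
        values = ", ".join(f"{name} = {r_fit(name, l_s, 0.1):.4f}"
                           for name in ("R", "R1", "R2", "R12"))
        print(f"l_s = {l_s:4.1f} Mpc/h at k = 0.1: {values}")
```

At k = 0.1 the fitted values stay within a few per cent of unity for all smoothing lengths, and within about half a per cent for l_s = 20 Mpc/h.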



In order to apply the findings from simulations to our model we make the following assumptions: 
(i) The bias is scale-independent and does not change with the smoothing scale. We take b = 1.05 for the whole sample,
b_1 = 0.97 for M12 and b_2 = 1.47 for the M13 sample. This assumption is supported by Fig. 10.

(ii) The non-linearity parameters have a linear dependence on scale k. For each smoothing length we fit the best linear
relation up to k = 0.1 h/Mpc.

(iii) The noise is Poisson-like. Therefore the diagonal terms of the noise are N_ii(k) = α_i W^2(k·l_s)/n_i, where α_i is a
parameter that takes into account the deviations from the ideal Poisson noise (see Fig. 12, left panel), and W(k·l_s) is the
smoothing filter. The α values used are shown in Table 2. According to the number of haloes of the two tracers and the volume
of the simulation we have n = 2.5 × 10^-3 h^3/Mpc^3, n_1 = 2.1 × 10^-3 h^3/Mpc^3 and n_2 = 4.0 × 10^-4 h^3/Mpc^3.

(iv) We take the off-diagonal noise terms to be zero. This has some support from the simulations (Fig. 12, left panel), where
this term is N_12/√(N_11 N_22) < 0.2.

(v) Finally, we assume that the cross-correlation terms between the ε field and the matter field are 0.

We summarise all these assumptions in Table 2.
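
Assumption (iii) translates into a one-line noise model. In the sketch below the filter W is passed in as a function; the default Gaussian form exp(-x^2/2) is an assumption made here for illustration, since the exact normalisation of the smoothing kernel is not restated in this section. The α values are those of Table 2 and the function names are hypothetical.

```python
import math

N1 = 2.1e-3  # number density of M12 in (h/Mpc)^3
N2 = 4.0e-4  # number density of M13 in (h/Mpc)^3
ALPHA = {2.5: (1.365, 0.680), 5.0: (1.428, 0.680),
         10.0: (1.785, 0.760), 20.0: (2.100, 0.840)}  # (alpha_1, alpha_2) per l_s


def gaussian_window(x: float) -> float:
    """Illustrative Gaussian filter W(x) = exp(-x^2 / 2); the exact form is assumed."""
    return math.exp(-0.5 * x * x)


def noise_diag(alpha: float, n: float, k: float, l_s: float,
               window=gaussian_window) -> float:
    """Diagonal noise term N_ii(k) = alpha_i W^2(k l_s) / n_i."""
    return alpha * window(k * l_s) ** 2 / n


if __name__ == "__main__":
    for l_s, (a1, a2) in sorted(ALPHA.items()):
        print(f"l_s = {l_s:4.1f}: N_11 = {noise_diag(a1, N1, 0.05, l_s):8.1f}, "
              f"N_22 = {noise_diag(a2, N2, 0.05, l_s):8.1f}  (Mpc/h)^3 at k = 0.05")
```

Setting the window to unity recovers the rescaled Poisson level α_i/n_i, which is the quantity compared with the Poisson prediction in Fig. 12.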

Applying these conditions yields the results shown in Fig. 13. In the left panel the fractional errors for the one- (dashed
line) and two-tracer model (solid line) are shown as a function of the smoothing scale l_s, and the ratio of these two errors
is shown in the right panel.

The improvement between the two cases under these assumptions is rather modest: σ_2tr/σ_1tr ≃ 0.9. This is because it is
mainly dominated by the ratio of biases (we have b_2/b_1 ≃ 1.5) if we are in a region where S/N ≃ 10. This result is robust to
small changes in the bias modelling, e.g. perfectly linear bias. We have also tried to fit the off-diagonal term to some
non-zero value, but we have found that doing this does not produce any significant change in the plots of Fig. 13.

There may be some merit in splitting the sample in a different way; by choosing samples with very different biases, the
gains should be larger, but in practice doing this almost certainly requires one sample to consist of rare objects, which will
have very high shot noise.

Figure 13. Left panel: error on x as a function of the smoothing length l_s. The dashed line is for the one-tracer model
whereas the solid line is for the two-tracer one. Right panel: ratio between the two models.



4 DISCUSSION & CONCLUSIONS 



We have revisited the method of circumventing sample variance in the measurement of f = d ln D/d ln a (D being the linear
growth factor), based on comparing the clustering properties of two differently-biased tracers of the dark matter distribution.
This method was recently investigated by McDonald & Seljak (2009), although a similar technique in a different context was
presented in Pen (2004). Along the same lines, Slosar (2009) and Seljak (2009) propose to compare the clustering of
differently biased tracers to circumvent sample variance in the measurement of primordial non-Gaussianity.

Most of the statistical power of these measurements comes from very large scales, where cosmic variance is the dominant 
contribution to the statistical error-bars. By suppressing cosmic variance, this approach promises to reduce drastically error- 
bars on cosmologically very important quantities; for example it would allow for a high-precision determination of growth of 
structure as a function of redshift, as encoded in fD, and an improvement of dark energy figures of merit by large factors. 

All these approaches assume that the observed objects (i.e., galaxies) trace the dark matter deterministically; the galaxy
density field is assumed to be proportional to the dark matter field with the constant of proportionality given by a single
parameter, the bias. This goes under the name of the linear bias model. An important underlying assumption is therefore
made, that there is no stochasticity between the tracer field and the dark matter field on the scales of interest, which is
expected to break down at some level, at least on small scales. While the linear bias model has been extremely successful in
cosmology (e.g., Reid et al. (2010) and references therein), it is well known that the linear bias model might provide a good
description of the galaxy power spectrum even if the relation between the galaxy and dark matter overdensities is not that
of a linear bias (e.g., Heavens et al. 1998).

Galaxies are believed to form inside dark matter haloes, but their formation probability as a function of halo mass and
their exact radial distribution are still the subject of active research. The process of halo occupation by galaxies is expected
to be stochastic to some extent, but the details of the galaxy distribution within haloes are expected to become increasingly
unimportant on large scales. Here we simplify the issue by assuming (possibly with an over-simplification) that dark matter
haloes can be used as tracers. The linear deterministic bias model, however, is known not to be a perfect description of halo
clustering, and the relation between dark matter and haloes and between haloes of different masses is known to be stochastic. For
example, Seljak & Warren (2004) point out that "the fluctuations between haloes and the initial or final matter fields are
never below 10-20 per cent" and that "the scatter between the fields in individual modes is significant and one cannot assume
that the fields are simply proportional to one another". This was further explored and quantified by Bonoli & Pen (2009).
Note that the halo overdensity field is expected to have a stochastic component even if it were a perfect Poisson sampling of
a linearly biased dark matter field, but the above references and N-body simulations show that there are additional sources
of stochasticity beyond shot noise.



We have thus set out to generalise the approach of McDonald & Seljak (2009), by assuming that the bias of haloes may not
be perfectly linear and allowing for some stochasticity. We have computed the expected error on the quantity fD achievable
by comparing clustering of differently biased tracers (thus suppressing cosmic variance) and by combining the different tracers 
in a single sample (thus reducing shot noise and stochasticity but carrying along sample variance in full). 

We have analysed how the bias, the noise, the non-linearity and stochasticity affect the measurements of fD and explored
in which signal-to-noise regime it is significantly advantageous to split a galaxy sample into two differently-biased tracers. We
used results from simulations to set plausible values for these parameters to see how great the gains may be in practice. We
find that even a small amount of stochasticity (either in the form of Poisson noise or in a more general form) and of non-linearity
can significantly limit the performance of the two-tracer approach. In our analysis we have also assumed a scale-independent
bias. This may be enough, for the mass range studied, if we consider only dark matter haloes as tracers. This is indeed what
we have seen in simulations. On the other hand, more realistic approaches, which account for galaxies as
tracers instead of haloes, should include a scale-dependent bias. However, including this in our formalism can only reduce the
gain achievable by splitting the sample. We expect the ratio of errors to increase from the current value of 0.9 to values even
closer to 1 if the bias is strongly scale dependent.

We have shown that it is significantly advantageous to split the sample only in the very high signal-to-noise regime and
that, even though the gain is maximised by increasing the ratio of the biases of the two tracers, both tracers should be well
sampled. We have explored different ways of selecting and splitting dark matter haloes obeying a ΛCDM mass function and
found that one can achieve up to a 40% reduction of the error on fD. While this would correspond to the gain from a three
times larger survey volume if the two tracers were not to be split, it is much smaller than the improvement forecast in the
absence of stochasticity and bias non-linearity.
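
The correspondence between a 40% error reduction and a roughly three times larger volume can be checked with a short calculation. The sketch below assumes that the statistical error scales as the inverse square root of the survey volume, which is the usual mode-counting argument and is not spelled out in the text; the function name is illustrative.

```python
def equivalent_volume_factor(error_ratio):
    """Volume increase that yields the same error reduction.

    Assumes (our assumption) that the error scales as V**-0.5,
    so an error ratio r corresponds to a volume factor 1 / r**2.
    """
    return 1.0 / error_ratio ** 2


# A 40% reduction of the error on fD means an error ratio of 0.6.
ratio = 0.6
print(round(equivalent_volume_factor(ratio), 2))  # 2.78, i.e. roughly three
```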

In addition we should note that these findings apply to dark matter haloes as tracers, while realistic surveys would select
galaxies: the galaxy-host halo relation is likely to introduce extra stochasticity which would reduce the gain further. The
formalism we have developed, however, is general enough that it can be used to optimise survey design and tracer selection
and to optimally split (or combine) tracers to minimise the error on the cosmologically interesting quantities.

APPENDIX A: RELATIONS BETWEEN ONE-TRACER CASE AND TWO-TRACER CASE

The parameters $R(r)$, $R_1(r)$, $R_2(r)$ and $R_{12}(r)$ are not fully independent; the same is true for the set of parameters $\beta(r)$,
$\beta_1(r)$ and $\beta_2(r)$, and also for the different noise matrices. Here we make their relations explicit.

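A1 Relation between the R parameters

Let the tracer (galaxies, haloes...) overdensity $\delta_g(\mathbf{x})$ be defined as

$$\delta_g(\mathbf{x}) \equiv \frac{\rho_g(\mathbf{x}) - \bar{\rho}_g}{\bar{\rho}_g} = \frac{n_g(\mathbf{x}) - \bar{n}_g}{\bar{n}_g}, \tag{A1}$$

where $\rho_g(\mathbf{x})$ is the tracer density at $\mathbf{x}$. In the second equality we have used the fact that $\rho_g(\mathbf{x}) \propto n_g(\mathbf{x})$, with $n_g(\mathbf{x})$ the number
of tracers at $\mathbf{x}$. The total number density of galaxies is $n_g(\mathbf{x}) = n_{g1}(\mathbf{x}) + n_{g2}(\mathbf{x})$. Defining the ratio of the number of galaxies as

$$Y \equiv \frac{\bar{n}_1}{\bar{n}_2}, \tag{A2}$$

we can write the overdensity of galaxies as

$$\delta_g(\mathbf{x}) = \frac{Y}{1+Y}\,\delta_{g1}(\mathbf{x}) + \frac{1}{1+Y}\,\delta_{g2}(\mathbf{x}). \tag{A3}$$

Recalling the definitions of $R(r)$, $R_{12}(r)$, $R_1(r)$, $R_2(r)$ we find that

$$R(r) = \frac{R_1(r)\,Y\beta_2(r) + R_2(r)\,\beta_1(r)}{\sqrt{\beta_2^2(r)\,Y^2 + \beta_1^2(r) + 2R_{12}(r)\,Y\beta_1(r)\beta_2(r)}}. \tag{A4}$$

Note that, as expected, when $Y \to 0$, $R(r) \to R_2(r)$ and when $Y \to \infty$, $R(r) \to R_1(r)$.

This last equation is also valid in $k$-space according to the definitions of $\beta_i(k)$, $b_i(k)$, $R_i(k)$ and $R_{12}(k)$. This is because
the definitions of all these parameters are mathematically symmetric in Fourier and configuration space.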
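
The limiting behaviour of the combined coefficient can be verified numerically. The following is a minimal sketch, assuming the combination rule R = (R1 Y β2 + R2 β1) / sqrt(β2² Y² + β1² + 2 R12 Y β1 β2); the function name and the parameter values are illustrative only and are not taken from the simulations.

```python
import math


def combined_R(R1, R2, R12, beta1, beta2, Y):
    """Non-linearity coefficient of the combined (one-tracer) sample.

    Y is the ratio of number densities of tracer 1 to tracer 2.
    """
    numerator = R1 * Y * beta2 + R2 * beta1
    denominator = math.sqrt(
        beta2 ** 2 * Y ** 2 + beta1 ** 2 + 2.0 * R12 * Y * beta1 * beta2
    )
    return numerator / denominator


# Illustrative (hypothetical) parameter values.
R1, R2, R12, beta1, beta2 = 0.95, 0.90, 0.90, 0.3, 0.6

print(combined_R(R1, R2, R12, beta1, beta2, Y=1e-9))  # approaches R2
print(combined_R(R1, R2, R12, beta1, beta2, Y=1e9))   # approaches R1
```

A further sanity check is that two identical, perfectly correlated tracers (R1 = R2, R12 = 1, β1 = β2) return the same coefficient for any Y.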

A2 Relation between the β parameters

Similarly, we can derive the relation between $\beta(r)$, $\beta_1(r)$ and $\beta_2(r)$. Recalling the definition of the $\beta$s as

$$\beta_i(r) \equiv \frac{f}{b_i(r)}, \qquad \beta(r) \equiv \frac{f}{b(r)}, \tag{A5}$$

then the relation between the $\beta$s is

$$\beta_i(r) = \beta(r)\left[\frac{b^2(r)}{b_i^2(r)}\right]^{1/2}. \tag{A6}$$

From this we obtain

$$\frac{1}{\beta^2(r)} = \frac{1}{\beta_1^2(r)\,(1+1/Y)^2} + \frac{1}{\beta_2^2(r)\,(1+Y)^2} + \frac{2R_{12}(r)}{\beta_1(r)\beta_2(r)\,(1+Y)(1+1/Y)}. \tag{A7}$$

Again, this relation is valid for both configuration space and $k$-space because of reasons of symmetry in the definitions of the
parameters. Clearly we can also write down a relation between the $b$ parameters:

$$b^2 = \frac{b_1^2}{(1+1/Y)^2} + \frac{b_2^2}{(1+Y)^2} + \frac{2R_{12}\,b_1b_2}{(1+Y)(1+1/Y)}. \tag{A8}$$
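
The relations for the combined bias and the combined β can be cross-checked against each other. The sketch below assumes b² = b1²/(1+1/Y)² + b2²/(1+Y)² + 2 R12 b1 b2/[(1+Y)(1+1/Y)] and β = f/b; the function names and the numerical values are illustrative assumptions.

```python
import math


def combined_bias(b1, b2, R12, Y):
    """Bias of the combined sample from the biases of the two tracers."""
    b_squared = (
        b1 ** 2 / (1.0 + 1.0 / Y) ** 2
        + b2 ** 2 / (1.0 + Y) ** 2
        + 2.0 * R12 * b1 * b2 / ((1.0 + Y) * (1.0 + 1.0 / Y))
    )
    return math.sqrt(b_squared)


def combined_beta(beta1, beta2, R12, Y):
    """beta of the combined sample, using beta_i = f / b_i."""
    inverse_beta_squared = (
        1.0 / (beta1 ** 2 * (1.0 + 1.0 / Y) ** 2)
        + 1.0 / (beta2 ** 2 * (1.0 + Y) ** 2)
        + 2.0 * R12 / (beta1 * beta2 * (1.0 + Y) * (1.0 + 1.0 / Y))
    )
    return 1.0 / math.sqrt(inverse_beta_squared)


# Hypothetical values: growth rate f, biases b1 and b2, density ratio Y.
f, b1, b2, R12, Y = 0.5, 2.0, 1.0, 0.9, 0.5
b = combined_bias(b1, b2, R12, Y)
beta = combined_beta(f / b1, f / b2, R12, Y)
print(abs(beta - f / b) < 1e-12)  # the two routes agree
```

For perfectly correlated tracers (R12 = 1) the combined bias reduces to the number-weighted mean (Y b1 + b2)/(1 + Y), as expected from the weighting of the two overdensities.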



A3 Relation between the noise terms

Let the noise matrix for the one-tracer model be

$$N_{1\mathrm{tr}} = N, \tag{A9}$$

and for the two-tracer model,

$$N_{2\mathrm{tr}} = \begin{pmatrix} N_{11} & N_{12} \\ N_{12} & N_{22} \end{pmatrix}, \tag{A10}$$

both in $k$-space.

Assuming Poisson noise terms and distinct populations, the off-diagonal terms are zero, $N_{12} = 0$. Setting $\bar{n}_1$ and $\bar{n}_2$ to
be the number densities of galaxies of type 1 and 2 respectively, and $\bar{n} = \bar{n}_1 + \bar{n}_2$ the total number density of galaxies, we can say that

$$(N_{11})^{-1} = \bar{n}_1, \qquad (N_{22})^{-1} = \bar{n}_2, \qquad (N)^{-1} = \bar{n}. \tag{A11}$$

Finally we obtain

$$N_{11} = N\,(1 + 1/Y), \qquad N_{22} = N\,(1 + Y). \tag{A12}$$
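
As a quick consistency check of the Poisson noise relations, the sketch below builds the noise terms from two number densities and confirms that N (1 + 1/Y) and N (1 + Y) reproduce the inverse number densities of each tracer. The function name and the number densities are hypothetical.

```python
def poisson_noise_terms(n1, n2):
    """Return (N, N11, N22) for Poisson noise and distinct populations.

    n1 and n2 are the mean number densities of the two tracers;
    the off-diagonal term N12 is zero in this case.
    """
    Y = n1 / n2              # ratio of number densities
    N = 1.0 / (n1 + n2)      # one-tracer (combined sample) noise
    N11 = N * (1.0 + 1.0 / Y)
    N22 = N * (1.0 + Y)
    return N, N11, N22


# Hypothetical number densities (arbitrary units).
n1, n2 = 2.0e-4, 6.0e-4
N, N11, N22 = poisson_noise_terms(n1, n2)
print(abs(N11 * n1 - 1.0) < 1e-12, abs(N22 * n2 - 1.0) < 1e-12)
```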



A4 Constraints between the non-linearity parameters

Given the definitions of the non-linearity coefficients, using the Cauchy-Schwarz inequality we find that

$$-1 \le R(r),\ R_1(r),\ R_2(r),\ R_{12}(r) \le 1. \tag{A13}$$

However, negative values of these $R$ parameters represent a negative bias for tracers relative to the dark matter, with a
doubtful physical connection. For this reason we restrict the possible values of these parameters to the range

$$0 \le R(r),\ R_1(r),\ R_2(r),\ R_{12}(r) \le 1. \tag{A14}$$

The parameters $R_1(r)$, $R_2(r)$ and $R_{12}(r)$ are not totally independent, but are related by the condition of Eq. A32,

$$1 - R_1^2(r) - R_2^2(r) - R_{12}^2(r) + 2R_1(r)R_2(r)R_{12}(r) \ge 0. \tag{A15}$$

If we isolate $R_{12}(r)$ as a function of $R_1(r)$ and $R_2(r)$ the last equation becomes

$$R_1(r)R_2(r) - \sqrt{R_1^2(r)R_2^2(r) + 1 - R_1^2(r) - R_2^2(r)} \le R_{12}(r) \le R_1(r)R_2(r) + \sqrt{R_1^2(r)R_2^2(r) + 1 - R_1^2(r) - R_2^2(r)}. \tag{A16}$$
This equation is enough if we are working only with the two-tracer model. However, if we want to compare this model
with the one-tracer model, we have to make sure that the $R(r)$ parameter is also between 0 and 1 (see Eq. A4),

$$0 \le \frac{R_1(r)\,Y\beta_2(r) + R_2(r)\,\beta_1(r)}{\sqrt{\beta_2^2(r)\,Y^2 + \beta_1^2(r) + 2R_{12}(r)\,Y\beta_1(r)\beta_2(r)}} \le 1. \tag{A17}$$

Since $R_1(r)$, $R_2(r)$, $Y$ and $\beta_i(r)$ are always positive, the first inequality always holds, and we find that

$$R_{12}(r) \ge \frac{\left[R_1(r)\,Y\beta_2(r) + \beta_1(r)R_2(r)\right]^2 - Y^2\beta_2^2(r) - \beta_1^2(r)}{2Y\beta_1(r)\beta_2(r)} \tag{A18}$$

to satisfy the second.

This minimum value could be lower or higher than the one given by Eq. A16, depending on the values of the other
parameters.
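Therefore, the limits for $R_{12}(r)$ are

$$\max\left\{ R_1(r)R_2(r) - \sqrt{R_1^2(r)R_2^2(r) + 1 - R_1^2(r) - R_2^2(r)},\ \frac{\left[R_1(r)\,Y\beta_2(r) + \beta_1(r)R_2(r)\right]^2 - Y^2\beta_2^2(r) - \beta_1^2(r)}{2Y\beta_1(r)\beta_2(r)} \right\} \le R_{12}(r) \le R_1(r)R_2(r) + \sqrt{R_1^2(r)R_2^2(r) + 1 - R_1^2(r) - R_2^2(r)}. \tag{A19}$$

As we said before, this last equation is also valid for the $k$-space parameters because of the symmetry in the definitions.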
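
The allowed interval for R12 can be evaluated for any choice of the other parameters. The sketch below is a minimal illustration, assuming the lower limit is the larger of the two-tracer bound and the bound that keeps the combined R below 1; the function name and input values are hypothetical, and the additional restriction to non-negative values is not applied here.

```python
import math


def R12_limits(R1, R2, beta1, beta2, Y):
    """Return (lower, upper) limits on R12 for given R1, R2, beta1, beta2, Y."""
    # R1^2 R2^2 + 1 - R1^2 - R2^2 equals (1 - R1^2)(1 - R2^2); the factored
    # form is used and clipped at zero to guard against rounding errors.
    root = math.sqrt(max(0.0, (1.0 - R1 ** 2) * (1.0 - R2 ** 2)))
    lower_two_tracer = R1 * R2 - root
    lower_one_tracer = (
        (R1 * Y * beta2 + beta1 * R2) ** 2 - Y ** 2 * beta2 ** 2 - beta1 ** 2
    ) / (2.0 * Y * beta1 * beta2)
    upper = R1 * R2 + root
    return max(lower_two_tracer, lower_one_tracer), upper


# Hypothetical parameter values.
lower, upper = R12_limits(R1=0.9, R2=0.8, beta1=0.3, beta2=0.6, Y=1.0)
print(lower <= upper <= 1.0)
```

When both tracers are perfectly correlated with the dark matter (R1 = R2 = 1), the interval collapses to the single value R12 = 1.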



APPENDIX B. CONSTRAINTS ON THE NON-LINEAR COEFFICIENTS 

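Suppose there are 3 possibly correlated fields, $x$, $y$, $z$ (in our application these would correspond to $\delta_{g1}$, $\delta_{g2}$, $\delta_m$) and the
corresponding non-linear coefficients are

$$r_1^2 = \frac{\langle xz\rangle^2}{\langle x^2\rangle\langle z^2\rangle}, \tag{A20}$$

$$r_2^2 = \frac{\langle yz\rangle^2}{\langle y^2\rangle\langle z^2\rangle}, \tag{A21}$$

$$r_3^2 = \frac{\langle xy\rangle^2}{\langle x^2\rangle\langle y^2\rangle}. \tag{A22}$$

Using the Cauchy-Schwarz inequality we can state that

$$-1 \le r_i \le +1. \tag{A23}$$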
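We want to know what the constraints on the triplet $r_1$, $r_2$ and $r_3$ are. To solve this, consider

$$C = \langle (x + \lambda y + \mu z)^2 \rangle \ge 0. \tag{A24}$$

This is at least zero for all $\lambda$ and $\mu$, and in particular for the values which minimise $C$, namely $\lambda'$ and $\mu'$:

$$\frac{\partial C}{\partial \lambda} = 2\,\langle (x + \lambda' y + \mu' z)\,y \rangle = 0, \tag{A25}$$

$$\frac{\partial C}{\partial \mu} = 2\,\langle (x + \lambda' y + \mu' z)\,z \rangle = 0. \tag{A26}$$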

(A27) 

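The system has a unique solution if and only if $r_2^2 \ne 1$. In that case the values are

$$\lambda' = \frac{1}{D}\left(\langle yz\rangle\langle xz\rangle - \langle xy\rangle\langle z^2\rangle\right), \tag{A28}$$

$$\mu' = \frac{1}{D}\left(\langle xy\rangle\langle yz\rangle - \langle y^2\rangle\langle xz\rangle\right), \tag{A29}$$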



(A30) 

where $D = \langle y^2\rangle\langle z^2\rangle - \langle yz\rangle^2$. Substituting these values into Eq. A24 for $C$ gives

$$(1 - r_2^2)\,(1 - r_1^2 - r_2^2 - r_3^2 + 2r_1r_2r_3) \ge 0. \tag{A31}$$

Provided that $r_2^2 \ne 1$ we can write

$$1 - r_1^2 - r_2^2 - r_3^2 + 2r_1r_2r_3 \ge 0. \tag{A32}$$

By symmetry, this last equation holds if at least one of the $r$s is different from $\pm 1$. Note that
if two of the $r$s are equal to one, so is the third. In our application $r_1$, $r_2$, $r_3$ correspond to $R_1$, $R_2$, $R_{12}$.
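
The constraint on the triplet of coefficients can be illustrated with synthetic fields. The sketch below is only a toy check, assuming Gaussian random samples in place of the density fields and using raw second moments for the angle brackets; the amplitudes and noise levels are arbitrary, and the function names are illustrative.

```python
import random


def corr(a, b):
    """Coefficient <ab> / sqrt(<a^2><b^2>) from sample second moments."""
    ab = sum(p * q for p, q in zip(a, b))
    aa = sum(p * p for p in a)
    bb = sum(q * q for q in b)
    return ab / (aa * bb) ** 0.5


def constraint_lhs(r1, r2, r3):
    """Left-hand side 1 - r1^2 - r2^2 - r3^2 + 2 r1 r2 r3 of the constraint."""
    return 1.0 - r1 ** 2 - r2 ** 2 - r3 ** 2 + 2.0 * r1 * r2 * r3


random.seed(0)
# z plays the role of the matter field; x and y are biased, noisy tracers.
z = [random.gauss(0.0, 1.0) for _ in range(2000)]
x = [2.0 * zi + random.gauss(0.0, 0.5) for zi in z]
y = [1.2 * zi + random.gauss(0.0, 0.8) for zi in z]

r1, r2, r3 = corr(x, z), corr(y, z), corr(x, y)
print(constraint_lhs(r1, r2, r3) >= 0.0)  # satisfied by any real fields

# A triplet that no set of fields can realise: two tracers both tightly
# correlated with the matter field but uncorrelated with each other.
print(constraint_lhs(0.9, 0.9, 0.0) < 0.0)
```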